{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# CPSC 330 Lecture 15"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Lecture plan\n",
    "\n",
    "- 👋\n",
    "- **Turn on recording**\n",
    "- KNN for supervised learning (15 min)\n",
    "- T/F questions (10 min)\n",
    "- Intro to NLP (5 min)\n",
    "- Break (5 min)\n",
    "- Word counts, TF-IDF (10 min)\n",
    "- Word embeddings (15 min)\n",
    "- Useful software package: spaCy (10 min)\n",
    "- General thoughts on feature engineering (5 min)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Learning objectives\n",
    "\n",
    "- Apply the $k$-nearest neighbours algorithm for supervised learning (regression and classification).\n",
    "- Distinguish between KNN for supervised vs. unsupervised learning.\n",
    "- Apply TF-IDF as an alternative to `CountVectorizer`.\n",
    "- Explain the concept of a word embedding, and why it might be useful.\n",
    "- Perform basic NLP operations with the spaCy library."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "import scipy.sparse\n",
    "\n",
    "from sklearn.linear_model import LinearRegression, LogisticRegression\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer\n",
    "from sklearn.neighbors import KNeighborsClassifier, KNeighborsRegressor\n",
    "\n",
    "from plot_classifier import plot_classifier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.rcParams['font.size'] = 16"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import nltk\n",
    "from nltk.tokenize import sent_tokenize, word_tokenize\n",
    "from nltk.corpus import stopwords\n",
    "from nltk.stem.porter import PorterStemmer\n",
    "from nltk.stem import WordNetLemmatizer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "import gensim\n",
    "from gensim.test.utils import common_texts\n",
    "from gensim.models import Word2Vec, KeyedVectors, FastText"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Announcements\n",
    "\n",
    "- hw6 released, due next Monday at 11:59pm\n",
    "- Golden Rule violation in Lecture 13 - see https://piazza.com/class/kb2e6nwu3uj23?cid=440\n",
    "- Midterm grading almost done\n",
    "- Today's lecture is a bit of a grab bag of topics\n",
    "  - I am still working on the ideal storyline for Lectures 13-15 (but I know it's out there somewhere!)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## KNN for supervised learning (15 min)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Classification"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Here is some toy data for binary classification.\n",
    "- I want to predict the point in grey."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](img/scatter.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- An intuitive way to do this is to give the grey point the same label as the \"closest\" point ($k = 1$)\n",
    "- We would predict a target of **1 (orange)** in this case"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](img/scatter_k1.png)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- We could also use the 3 closest points ($k = 3$) and take a majority vote among their labels...\n",
    "- We would predict a target of **0 (blue)** in this case"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "![](img/scatter_k3.png)"
   ]
  },
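  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The $k = 1$ versus $k = 3$ comparison above can be sketched in code. This is a minimal illustration, not the data behind the figures: the coordinates in `toy_X`, the labels in `toy_y`, and the query `grey_point` are made up for this example, and were chosen so that the single closest point has label 1 while two of the three closest points have label 0.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "\n",
    "# Hypothetical toy data (NOT the points from the figures above).\n",
    "# Distances from the origin: 0.5, 0.8, 0.9, then three far-away points.\n",
    "toy_X = np.array([[0.5, 0.0], [0.0, 0.8], [-0.9, 0.0],\n",
    "                  [2.0, 2.0], [2.5, 1.5], [-2.0, -2.0]])\n",
    "toy_y = np.array([1, 0, 0, 1, 1, 0])  # 1 = orange, 0 = blue\n",
    "\n",
    "# The point whose label we want to predict.\n",
    "grey_point = np.array([[0.0, 0.0]])\n",
    "\n",
    "for k in [1, 3]:\n",
    "    # Fit a classifier that votes among the k closest training points.\n",
    "    toy_knn = KNeighborsClassifier(n_neighbors=k)\n",
    "    toy_knn.fit(toy_X, toy_y)\n",
    "    print(f'k = {k}: predicted target = {toy_knn.predict(grey_point)[0]}')\n",
    "```\n",
    "\n",
    "With these made-up points, the prediction flips between $k = 1$ and $k = 3$, which is the same effect the two figures show."
   ]
  },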
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Going back to the cities dataset:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "cities_df = pd.read_csv('data/cities_USA.csv', index_col=0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "cities_df_train, cities_df_test = train_test_split(cities_df)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train = cities_df_train.drop(columns=['vote'])\n",
    "X_test = cities_df_test.drop(columns=['vote'])\n",
    "y_train = cities_df_train['vote']\n",
    "y_test = cities_df_test['vote']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "knn = KNeighborsClassifier(n_neighbors=100)\n",
    "knn.fit(X_train, y_train);"
   ]
  },
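  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To get a feel for how `n_neighbors` changes the model, one option is to compare training and test accuracy for a few values of $k$. The sketch below is an illustration under assumptions: it uses synthetic data from `make_classification` as a stand-in (not the cities dataset), and all of the `demo_*` names are hypothetical.\n",
    "\n",
    "```python\n",
    "from sklearn.datasets import make_classification\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "\n",
    "# Synthetic 2-feature binary classification data (stand-in, not cities_USA.csv).\n",
    "demo_X, demo_y = make_classification(n_samples=400, n_features=2,\n",
    "                                     n_informative=2, n_redundant=0,\n",
    "                                     random_state=0)\n",
    "demo_X_train, demo_X_test, demo_y_train, demo_y_test = train_test_split(\n",
    "    demo_X, demo_y, random_state=0)\n",
    "\n",
    "# Small k gives a flexible model; large k averages over many neighbours.\n",
    "for k in [1, 10, 100]:\n",
    "    demo_knn = KNeighborsClassifier(n_neighbors=k)\n",
    "    demo_knn.fit(demo_X_train, demo_y_train)\n",
    "    train_acc = demo_knn.score(demo_X_train, demo_y_train)\n",
    "    test_acc = demo_knn.score(demo_X_test, demo_y_test)\n",
    "    print(f'k = {k}: train accuracy = {train_acc:.2f}, test accuracy = {test_acc:.2f}')\n",
    "```\n",
    "\n",
    "With $k = 1$, each training point is its own nearest neighbour, so training accuracy is typically perfect; the test accuracy is the more informative number when choosing $k$."
   ]
  },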
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "#??KNeighborsClassifier"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYsAAAEQCAYAAABBQVgLAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAABahElEQVR4nO2dd5hU1dnAf+/M7NKRDnERG6hIVZAiYEFFFEQUJcZYSPKFWBMxmmikCIKaWNAYSyyJBCxRUMHYRRRQAQHdpakgSlmll6Vsm5nz/XHu3Z2ZnZmd2TLb3t/zzDPsreceds973y7GGBRFURQlHp6qHoCiKIpS/VFhoSiKopSKCgtFURSlVFRYKIqiKKWiwkJRFEUpFV9VD6AyaN6gvslo0riqh6EAprCQ9KYN+SnQuqqHoihKKWzPXrXLGBP1j7VWCouMJo2ZPeriqh5GtSYnP5/b3v+IB4cMpmm9epV6L/+ObFbfMo/ln22r1PsoilI+Hrrj6E2x9qkZqo4yM3M1y7J/Ylbm6kq/lwh0fWQE45s8Ven3UhSlclBhgX3LHvvmO+Tk51f1UFJCTn4+L2at4Q3ghaw1lf7c3tYZ+NpksHX+Snqf3q5S76UoSuWgwoLUvmVXB2Zmrma4MQwFhhmTsud2NYxrR6TkdoqiVCB1Xlik+i27qnGfd0IgAMDEQCBlz+1tnYEIyPixNGzdstLvpyhKxVHnhUVVvWVXFe7zdnR+7khqn9sVGKPfGsVdA75KyT0VRSk/dVpYVOVbdlUQ+bwuqX5uV2BkT39CNQxFqSHUaWFR1W/ZqWZm5mr6B4N4ge9DPl6gXzCY0uf2ts4A4Na8aSm7p6IoZadW5lkkgvuWvTTKW3bfrDVc1aNrpecfpJof9uwlKz2NM2Ps9+zZm9LxiNfL1vkrGcNYXhk2h8M7d6f0/oqiJE6dFRaRb9kuoW/ZN/TpVUWjqxweGHpuVQ8hDG9LG0br35HN6LdGqcBQlGpMnRUW1e0tuy7ja5OhAkNRqjl1VlhUt7fsuo4rMC7vv5sZ86p6NIqiRFKnHdxK9UK8XmT8WMYsG1vVQ1EUJQIVFkq1wduyHb42NkpKs7wVpXqhwkKpdrhZ3iowFKX6oMJCqXaElgVRgaEo1QMVFkq1JFRgaGlzRal6VFgo1RZv64yixD0tba4oVYsKC6Va423Zrqi0uaIoVYcKC6Xa49aRGrNsrGoYilJFqLBQagS+NhlFGoZWqlWU1KPCQqkxuBrG6LdGqYahKClGhYVSo3CT9oauurtqB6IodQwVFkqNw9cmw5Y2Vx+GoqQMFRZKjcT1YZwv71b1UBSlTqDCQqmxeFtnkD39Ce3lrSgpQIWFUqNxe3mrwFCUykWFhVKjcbO8s6c/oWVBFKUSUWGh1Hi8LdsVlQVRDUNRKgcVFkqtwBUY2dOfqOqhKEqtRIWFUmvwtrRhtGOWjVUNQ1EqGBUWSq3C16bYh6FlQRSl4lBhodQ6XA1Dy4IoSsWhwkKplbhlQbS0uaJUDCoslFqLKzDGLNP2rIpSXlRYKLUatyyIjB9b1UNRlBqNCgul1uOWNtcIKUUpOyoslDqBWxZEs7wVpWyosFDqBN7WGWGlzRVFSY4qERYicqGILBSRgyKSIyLLRWRwyP7mIvKsiOwSkUMi8qGIdKuKsSq1C9fprRqGoiRHyoWFiPwOmAusAC4BLgdeBRo6+wWYBwwFbgZGAWnAAhFpn+rxKrUPV8PQHAxFSRxfKm8mIscAjwC3G2MeCdn1Xsi/RwADgcHGmAXOeZ8D3wN/An6firEqtRsRm4OxdtgcDu/cXdXDUZRqT6o1i18DQSCeDWAE8KMrKACMMfuBN4GLK3d4Sl3BjZAa/dYoLQuiKAmQamExEPgauEJEvhMRv4hsEJEbQ47pAqyOcu4aoIOINE7FQJXaj+u/GHfC/CoeiaJUf1ItLI4EOgEPAPcDQ4APgH+IyB+cY1oAe6Ocu8f5bh7t
wiIy1nGUL9+bm1exo1ZqLW7RwTHLxqqGoShxSLWw8ABNgN8ZY54xxnxkjLkeeBe403FuC2CinCvxLmyMedoY09sY07t5g/oVPnClduJt2a5Iw7i8v/ouFCUWqRYW7l/jBxHb3wfaAj/DahAtopzrahTRtA5FKRcS91VEUZRUC4s1Mba7f6pB55guUY45GdhsjDlYGQNTFEVRYpNqYfG6831+xPbzga3GmG3YHIsMETnT3SkiTYGLnH2KUuGYaIZPRVGKSGmeBfA2sAD4p4i0AjYCl2Ed3b9yjpkHfA7MEpHbsWanO7Hax99SPF6lDhDYmQ3Aq5+3pNhSqihKKCnVLIwxBhgJvAxMBv4H9AN+aYx53jkmCAzH+jWewGojAeBsY8yWVI5XqTuYqU9rcp6ixCHVmgXGmBzgRucT65g92AS+X6dqXIqiKEpstOqsoiiKUiop1ywUpTrh35Fd1UNQlBqBCgulzrP6lnksn7etqoehKNUaNUMpdZ6uj4zQUh+KUgoqLJQ6ja9NBiK2+uy1I6p6NIpSfVFhodR5vK2twJDxWkxQUWKhwkJRCO9vcdeAr6p2MIpSDVFhoSgOvjYZRSXLVcNQlHBUWChKCN6Wti+3ahiKEo4KC0WJQDUMRSmJCgtFiYJqGIoSjgoLRYlBqIbR+/R2VT0cRalSVFgoShxcDaPrIyNUYCh1GhUWilIKbo/uro9o1p5Sd1FhoSgJ4AqMMcvGaqa3UidRYaEoCeKWBpHxY6t6KIqSclRYKEoSuJneqmEodQ0VFoqSJKphKHWRWi8scvLzGfvmO+Tk51f1UJRahGoYSl2j1guLmZmrWZb9E7MyV1f1UJRaRqiGoZneSm2nVguLnPx8XsxawxvAC1lrVLtQKpzQarWqYSi1mVotLGZmrma4MQwFhhmj2oVSKYRqGJq4p9RWaq2wcLWKCYEAABMDAdUulErD1TA001uprdRaYeFqFR2dnzui2oVSubgaRtdHRmjxQaXWUSuFRSBowrQKF9UulMrGbdGqxQeV2katFBa7c3PpHwziBb4P+XiBfsGgahdKpeIKjK6PjGB8k6eqejiKUiH4qnoAlUGBP0BWehpnxtjv2bM3peNR6h7e1hkEdm+r6mEoSoVRK4VF+yOaMHvUxVU9DEVRlFpDrTRDKUp1Yev8lYxZpkl7Ss1HhYWiVBLelu2KSptre1alppO0sBCRDBF5WESWi8hGEenqbL9FRPpW/BAVpWYT2p5VtQylppKUsBCRLsAq4GrgR+BoIN3ZfTTwhwodnaLUEkK1jHEnzK/i0ShK8iSrWTwErAOOBS4FJGTfZ0C/ChqXotRKVMNQairJCouBwP3GmIOAidi3HdAsJEWJg/oxlJpKssIiGGdfKyC3HGNRlDqDr00GvjYZZE9/QhP3lBpBssJiGfCrGPtGA5+WbziKUrcQr5et81eqhqFUe5IVFvcAF4nI+1gntwHOFZEZwCXAtHgni8hZImKifPZFHNdcRJ4VkV0ickhEPhSRbkmOVVGqPd6W7cL8GIpSXUlKWBhjPgFGYh3c/8I6uO8HBgEjjTFLE7zU74H+IZ9z3R0iIsA8YChwMzAKSAMWiEj7ZMarKDWBUD+GahhKdSXpch/GmLeAt0SkI9AG2G2M+SbJy6wzxiyJsW8E1pE+2BizAEBEPsfWAvwTVtAoSq3D1TAaDpvD4Z27q3o4ihJGmTO4jTEbjDGflUFQlMYI4EdXUDj32g+8CWjBJ6XW4m1pgwk1SkqpjpSqWYjINclc0BjznwQOe0FEWgH7gPeAO4wxm519XYBoNcTXANeISGMndFdRah2+NrZabfb0J+h9yzyWf6aVa5XqQSJmqOcjfnbzKyTKNoB4wmI/NrHvEyAHOAX4C/C5iJxijNkBtAB+iHLuHue7OVBCWIjIWGAswJGNG8UZgqJUb7wt2xHYmU3XR0bQZerTzJhX1SNSlMTMUMeGfAYBW4F/AmcBnZ3vp4EtWF9DTIwxXxpjbjPG
vGmM+cQY8wjWkd2WYl+EUDLhz90e79pPG2N6G2N6N29QP4HHUpTqi7e1rScl48eqSUqpFpQqLIwxm9wPcDvwsjHmBmPMQmPMN8739cB/sQ7opDDGrAS+BU5zNu3BaheRNHe+tXORUifwtmxX1KL12hFVPRqlrpOsg/sc4IMY+z5w9peFUG1iDdZvEcnJwGb1Vyh1CbdFq4wfWyszvU0wyLovX2fWY1fy5NQzmfXYlaz78nVMMF6xCKUqSFZY5AO9Y+w7DShIdgAi0hs4AXBzNOYBGSJyZsgxTYGLnH2KUqdwTVJb56+sVRqGCQaZO+sWPnh9Ftuzx3H44Ptszx7HB6/NZN6scSowqhnJ5lm8AtwtIgHgVWzxwLbYUh+TgOfinSwiL2DzJVZiI6FOAe4EsoHHnMPmAZ8Ds0TkdqzZ6U6s9vG3JMerKLUC1+nN+LGMAZ7v83RVD6ncrPriJTav+4hC8y3FNUg7UVg4jE3rB/J11jw69xxZhSNUQklWs/gjVkjcB3yHjUr6DrgXK0j+WMr5q7F5FP/GhszeArwG9DXG7AIwxgSB4Viz1hPA60AAONsYsyXJ8SpKygkaw5vffsdlsxcwcMabXDZ7AW9++x1BEy1uI3G8rTOKMr2rg4aRl7ufN565krzc/WU6f9mHTyDmIF6eiNhTn8LCO1mx6JXyD1KpMJIt95FrjLka6z8Yg33jHwOcbIy5xhiTV8r59xljuhtjjjDGpBljjjLGjDXG/BRx3B5jzK+NMS2MMQ2NMecYYzKTejJFqQKCxvD79z5j0sK9rNn1ALtzP2XNrgeYtHAPf3jv83ILDKDIh1HVtaS+WvgMW7/7jMxFzwDJCY+83P0UHMhmLoY0HsYaGkLpxoH92RU+ZqXslCmD2xjzrTFmpjHmb873txU9MEWpiby1fiOfba1Hrn8ptqxZJ2AUuf5lfLo1jbc3bCz3PUI1jN6np66FTKgwyMvdT9biZ3kDQ9bi59i/dwv//fuFbPnu0yLhEY+vFj7DCGzc/EUE8fJwxBGraHJERmU8hlJGkm2r2qG0T2UNVFFqAjOyNpPrnwhE5vqkk+s/mwmfrKkw05QIdH1kRMo0jFBN4quFzzDCBBkKDAsG+eDFm8nZu5WfA1mLn4urXbiCZqoTAHkvuRHaRR5paffRa9DoSn4iJRmS1Sx+wDqo430Upc6y7dBBoHvE1iBwFfApef6nKsw0FaphVHbiXqgmkbnoWTIXPcPd/nwAbvPnsWPLl8wF3gHOCwTiaheuoOno/NwRuIjDeDkGOAWvtzMdOrbnpO7VwDGjFJFsNNSvKZld3RIYBhyH7XehKHWWdo0aszs3C2t+cnkJ2AAspljj6ESufxifbu3D2xs2MrzT8WW+p5u4N/6cU5l64LoyXyceoZrEhYFCvjGmaLF/E9u3YCh2IWgWyOeDxc/RY9Bvqd/giLDruELneUfQuNyL4U0KCTAOeNg+VBkwwSBfZ85lxeJXObA/myZHZNBr4OWc1ONixFPmuqkKyTu4nzfGzIj4PGyMOQf7l3Bc5QxTUWoG13bvQAPfZCA01uM54M+UNE3VJ9c/ieezNlMeXA1j6/yVlZK45y7wriYxNehniwmwD2s4egyY4hw7HniB2NrFVwufYUAwgJdwc4QXGEAALxsIBJawef0Wvs5KLq2qtLyNoN+vCYDlIOl+FnGYhQ2JHV+B11SUGsWwTsfx3sZtfLa1D7n+SUA3YD0lTVMu3dh2sGKKEriJe+PPeapCNYxoZqMLsQLCYLWJ0H3xtIuc7d/wbb3G9AXy83IIBhsA9Yr2p7GKQFHo7PSk8iy+zpzL5g3ZFBYsIlSDKywcxg/rB/LCk6PZu1MoLLgD6M7hg1l88Np9fLvqY0ZcNT2q5qGaSjEVKSzaUPLVSVHqFB4R/n7+6by9YSPPZ93OtoMHOVRoyPNHmqZcVtGuceMKubfbD2Pr/JWMYWxCiXvjmzzF1vkrWR2jHHoss9FErPgzwKrIawL9
gN7+QjIXPUPfIbcV7RtyTbG28eTUMzl88H2iz0v00Nl4i/eKxa86gqCkBucvvJMdP94E5nsiBUloAqAJBln31et8+sGTHNi/FxMsxDbqPBVox+GDm3j7v/ezYvFMrrz+FTy+ilxCqzfJRkOdEeVzrojcAjwILKqUUSpKDcIjwvBOxzN71NksvvYippzRJYppCiCPBr7JjOlesUGErtO7IkxS8cxGA8VLB+ffkftOAb7weFi99IWYkVE2NDYrxp1XkXv4YJipqDQzkxUusTU4jI9ogsRNADTBIG/M/APvvvoAOXsbY4JPAV8AT2GLbdcDPgT+wfbsPF588so6ZcJKVix+TEkHt+uJ+gS4vrwDUpTaRnTT1Coa+CYzoH0hF3aseFef68MYw9iYWkMkvU9vR9dHRpAx7gaypz9B+3NO5YwQs1EkBfkHwHjoW79J1Os1bdiMXbu+L6FduPQaeDkfvHYfhYXDCF/E84CpmOCdbM8+ushU1KnrmTHNTJvWD6RB40ZY4RNdg4OjYjy51WK+zpzLpm9XYUx7YGHYPaxxrT+wArgSGMaOH3vXqZIkYpII2xORsygpLPKATcaYatPSq2ubVmb2KO3AqlQMQWN4a/1GZmRtZtuhg7Rr1Jhru3dgWKfj8CQYtRM0xjFNbWbbwYO0a9yYMd07cGHHxK9RFvw7smkfJ0qqYeuWjH5rFADtzzmV7I9WYgwcNfgUgj36M+3TnmW6b17ufv5zb19eKczl5+kNufrOJSUio1xNYfP6rRQW3okrROF+bG3RmVjjRx5paQOp3+gwB/ad5xyzGegA/Ab4BfA6TZvfSe7BphQWhkadAeQh0htjzgUewYYyv4QNPNgMNKFp80PUb9CaHT8exhrSRkV5qjnA48BHRT+3zZjOVTe/WKY5qo48dMfRK4wxUYvFJqVZGGM+rpARKUoNwS3fYbOyHwC6szs3i0kLJ/P+xu08en7/hBZ71zRVnhDZsuBqGNdOJWrHvVvzprHVOS77o5V4W9u2ruURFECJpL1o2oV4PFx81SN8nTWPFYums+OntZjg0di2OVdQbCWvT2Hh7RTu+y3wKbbKUHesFnE/8DYwgcKCAjp0as/m9QPDhE9a2n20aNOI3dsX4fcfBv4PG8r856LrHMq5h4M53wBNiGvKYnPYz3WpJElSwsKpNtvfGLMsyr5ewDJjjLeiBqcoVU14+Y6Kz5FIBSLA+LHcNe6GqAJAvPZP1tu6YsprRDrFp/jz6BUj70I8Hjr3HEnnniMdh/crRDcjzcWakRYD6RRrBnuwb/q7aXJERpjwKXKAD7qGE7sOZ96Lt/LDN10IBJoDnxH6/xkIDMN2S8gnvimrQ9jPNbUkSaxAgXgkG/sV7xXKS/R2qIpSY4ldvqNiciRSgZuHkT39iYQzvd1jy1LdNlqoratdhBJZeDC+w/sjbLBuOjYb/lHgRuBd4B/AFuy7LHTuOZKrbn6R68d/wlU3v0jnniPx+HxcfNUjNGraCLiLaP+fxkwGdmC1lZLBCDAVq5XYn301tCRJvEAB8MR880lIWIiIR0RcjcHj/Bz6aQRcAOwq/6MoSvUhevkOl7LnSOTk5zP2zXfIyc8v/eAKws30Lk1guO1cf3zkiaSr20Ym8LlM8eeVqBkVWbW218DLSUu7j5IL9WHn053ibPiFhBZqhBXs3uGPm8gnHg/+wkPE+//0eHyIZGOd2XOwHZ/nAL2AI7AhtHPwpQ3kmE5H1ciSJOH5KMVzaH096U1jnVeqsBCRSUAhtguewRoNCyM+OdjQ61fL+RyKUq1o16gx8cI7Q3MkkhEAMzNXsyz7J2Zlrq6YgSZAqIYxvslTNGzdkq3zV8Y8NrT2VMPWLRO6R7xQ2/4Bf1g589CqtXm5+zmpx8V06NSetLSBhC7UXm8XrEaRRbxseH8CPTBKC9dt/bOTGDr6TzRtfhjxXId4+tG0+UR69DuVNkcW0rDxBbTNmM6QUdfETOSrzphgkMXvPx0zHwWO
jPlAifgsPna+BSsQnsMGHYeSD6wF/pfIgBWlpnBt9w5MWjiZXH/J8M7IHIlQAXBDn14xr5mTn8+LWWt4A/hl1hqu6tGVpvXqxTy+onGd3qOdiB83mS8et+ZNYyqlZ4XnxAm1BWi37RsgtgM8ms8h93AjcvYGsWagHOJpBqU5nOOF69pKt9fQuedITj7l0lKftabg+ieWL3qF3du/IRAoJPYcNoh5nVKFhTHmE2wOBSJigGeNMXUnBECp0ySaI5GMAJiZuZrhxtiF0phShUtlUOTUTkBQRMvZcENuI8NyQzO0YxHLAd7giJ+RteytIiFx1rBbOKnHxTx179nuqLHVqAZjTSdu2Kz7Mly6w/mkHhfzzaoFUSOmjq5hZqVESpEUhSdvyC4qc2K7YMdy4ufGvF9SeRY1Bc2zUCqSRHIkHl+2gpzM1cwIBLjG66VZj65RBUBOfj4XzPwvS/1+OmKt7319Pt65+ucp1S7KQmBnNsaAmfo0r34eXVgkwpL3HuSYRU8zK8Sv8XPx8Ia0pSD4GNAV21H5FbzeQpA0An6An2Gd06Fhs24+RgFpaQM5b9Q1pSbJmWDQ0V5eCYmYGs1J3UfUGLNSdCGQVST0XBPZui9f54PXZ0UkMr6AzTcJ3QbWV9TcGJMbdRJKFRYi8i/gHmPM986/4z6DMeY3pT5pJaPCQkklyQiAUKHiEk+4VDdcgeGSrLBwk/VWFuYWRUuBnbNuNCKPzcBNhOdB3ARkA8spubgNAAbh9X7MsSceWyP9CGUhuhAAG6U1kCGO0Jz591+w48dbCU8ydPurrMMmILrJkFOBrIAxgagWp0R8Fmdj49TA6n/xpEvtU1MUpRRcs1JYqGgU85JrqloaIigAJgYC9K0C30VZcHMxArvLVrAh0gFedF1gAH4+5jcEyCa83EYhMJnoDtm/AGMZcukEOp8ysk4ICoDli16JWzTxnVeuY8WiV9i9Yy1wMlabcDPWOwC/wkZ2XYeN8uqAFSB/irmGJ+KzODbk38ck9USKUstJRgDMzFxN/2Aw6kLZLxisEt9FWfG2bEdg97a42eHRiOYAzz28D7tgeUnjQwJ0xJqi3EXtW+JnVQdZ+dlLnNSj5piRysu+3d8Tb05MsDHbs8cBE4DzgXaEZqxbE14j7DwvcM6bAzbqNSrJZnCfAaw0xpQILndyLXoZYxYmc01FqWnk5Odz2/sf8eCQwUkJgB/27CUrPY0zY1zXs2dvJY+8YvG2bEdgZzaMHwsJlEOH6A7wWY9dyfbsPwCvYxey8dhF7SvgD1izSbys6h5sz97Fi0+O5pc3zq7WAiPR/hilHef3B4g/J8diTU85wENEL4zYG4p+G/MQmYQxwe2xxp5s1dkF2GyVEuU+gJOc/VruQ6nxhAqESNNQaIhsMgLggaHnVuKIqwZv6wz8O8oXHNlr4OW8N/tPBAKtCfdLLAPaY4tZ349d4CJ9FvcD44B6bM++sVpXgY3mlI7WgCmR44LBg8C9xJ8TsM7/WCa8yVhBMgeYRJsjG7M9mz2xxp+ssIhX7qMeEIizX1FqDLFyJiJDZGtCFFNFEkuI3jXgqzIXHjypx8V8OPchAgE32c6tCvsn4O/AJcAHwCDgDoodspOAHtiigxuANFYseqVcwiLRcNSydM+L28nv2wHMnzeRbVs2snfXRgoKDJh7nWf3ENmoyeZDNIkyJ26E2BXO9TcT34S3FriRthntufL6F5k+Pnads0QyuI8RkcEiMtjZ1Nv9OeQzDPgj4SUZFaVGEioQXshaE5aRHS1HojqQqvIh0TLPxesle/oTCWd5Q3hdKPF48Pm82EXNjdR5FLs8dXe+Z2Hflh/HNnV9CNgC/IhdHH8O1CdnX2S+cOKU1lzJBIMECgp49oGzefu/U9ievZrDB/PZnt2I92bPKDomFrE7+aXj9zcha+mXbM8eR0H+J2D+gQ0fvtqZEwht1NS0WQtszsmNzpz0x0YzjQNmYIXtYGAvNq/ihZDruKzCl9aQ
C6/4C7+8cXapXf8SMe5di20P9QE22ukx5+cPQ7a/CVwE/DWB6ylKtSaWQHCFyATHmT0xECghTKqKVJQPiSVE3cS+0W+NSrhQYWhdKBMM4ktriLXBh9Z+6kR4aQ4T8v09tgrtTcA7WD9HQ4KBQJm716376nV++CaLwoJGWGfw74A8CgsXsmn9FtZ+9RpP3nsmOXsbAU9ju+g9BmwhENjMd+u+ZF3mGyWua4JB1n35Ojt+Wkv0t/yXgIMY8wVWk1iGFQBuRd2bKV7obZb6gPOuR2SHc9yN2LLtACOBaygutLjEmZtHCBc8thDikFF30LlnYlFkiZihnseW/BBn5DdidZdQ8oFvjTEx7V2KUhOIjG4KjWpKNES2qsb8BpVbPiRe5rmvje2DkT39CcYArwybw+Gdu6NeJ7Qu1M8XP8fW7O84lJOLfTM+guLaT7/BmlUuILwHxQ/YN+XwMuMwDH9hv1L9FtHMSKcOuIz5cx8kEGiLXZzD+2UUFv6ZT/53M/l5rQj3qxR30TNmNx/NfYjOPeziG/T7mf/mRFYtexdjAlhL/T+A6YS/pz+HXezdirrhvTasue0qrHa1isZNjwSEtHo+CvLWY5fk/c7cdQZaRp0bq31MB44pU8Z6IqGzm4BNACJyNrAiWjSUotQGYgmE51ZmMnvN19UyRyIV5UPiCVH3uV0NI7Azm3EnzGfazp5RrxVaF+pCfyGvrf+SQHADViC8S/Hb9y+At7Dhnc0oXgAHE6vMeCAwgRWLpscUFrGcx+/PuYdAIIDtl9HQOTp0kV1G7uF8YjuLxwOPkZ+3k6+z5nFi1+H88/5zOXywAfBPihf+idi2rC9SLDA2UbKirnuP47ERTXdghQDs2ia8P2cGgcD0kOtOwJqcfDHnxo7x/2jQqBlnXzQu6Yz1ZDvlfZLM8YpSk4iXM3HqqrX0N6ba5UgksohXBElpVR5vTA0jsi7UPcFC3mAPNrx/FjZRzA0J9YRsC10A4zttD+zPjumENsZEdTIHAq5QeAO4EB+X4Wc2VkiNx2o9O7BazWBKtnXthq2v+ldWLJrO1u+XOYJiBSXf8HthfQs3Yp3Sh4leUdf132zA+i+6A/8gGJwPfB7lur2xfpx4Dm1o2uzIMgUBJBsNhYicj037O5GS4ssYY6p32zBFiUFpORPLvV7OTIv+J1NVORKpMI0lm3nuahj+HdmMfmsUGSEd+qI1RroIYQ734mElfm4gPEzWA0T2FOlAvByDJkccGUN7uBdkE/7CoyhO+nMXe/fN+3F8rMXDR3h5mABTsItsDvY34QVK1qd6GxiBNSM9xPbstWzPXov1EaRHjK8+tonT/2G1iyAdOnZm84absEX8HsKGv/6C6JrGKuKHwv5f3LkB2L+3bEEAySblXYh1Zn+Izat4F6uzDcDqUovKNApFqQaUljNxeru21SpXIlXlQ8qaee76MSRrCdCzhFbhci95vMkjBPHj4RQ8bMFPf4rrFjUhfAF0fRnRy4y3O+pk1q5cF6E9HI/f/wp2IXaT/kIX+5nOvb7HxzJex3AJDxPgVuwiWw/rUI/mCxiEdYa3xwaFute+FyswZhLuo+jmfGfg8Wzhp015WF9G5Jh+pGTvjtJCYT3APVHnxs3a9hcexASDSScvJqtZTMC638dhC7aMN8asFJETgPewYQmKUiNJhSAIGsNb6zcyI2sz2w4dpF2jxlzbvQPDOhVXsE2UVJUPqajM81h1oQ4BXgp5HbiERwnix0tfAvwWW5KiCdYM5C6Ari8jPMfAddr+tPm7KCGqLzl3jWYWGgS8DNTDR5ARWH/KRQR5jb8R4AOshjCB6G/0dwA3EN3x7V77ypBzVmGX3j8QDD5EMLg4xnnRBEN8rcoe/x2ECdvQ/Is/4S+8uUzJi8kKi5OwHpogNn7NB2CM+VZE7sbOZvxWVYpSRwkaw+/f+4zPttYj1/8A0J3duVlMWjiZ9zdu59Hz+yclMFJVPqSihGj0xkiGw4f3MRxDP8BH
Ia8Cl/A/ApyCfSNehjWvhAqHSxC5jfR64/D6vE6Z8Ws4qfsIp/9F5CIbu8OeveZjwA58bGGaE6J7L7m8yV8JcCnwZZRrunTDCpNY136cYmGRhxV8bYifXX0HNirLFQxBrAnsG+wyGy9zewJWo3mcYt/KOMqbvJissAgCfmOMEZGdzijc0h8/Yl33ilJjqcg3/0jeWr/RERRLCX2TzPUP49OtfXh7w0aGd0r8T6g6mcSikZOfzx8/Wcr4nAPM37WEMy5/lPoNjgg7Ji93P/+a0pOpxvAocCk4b/VeXmMoARYDfbDho5uxC+0m6tVvzOARf6Rzz5JZ002OyODwwci379LMN1m0btWac/d66WibZzj+lCCv0Z4Au4n/Rn9UnGuvxxZDdN/wtwN/A+4uZUyHnOe+AGt6+wRoiw2RHYCtuOtqDnc7/74CeBY4GpvpEH2spXUUjEayFbe+AY5x/r0cuEVEfiYirbHGuh+SHoGiVBPcN/9JC/eyZtcD7M79lDW7HmDSwj384b3PCZazUdiMrM3k+icS7U0y1z+J57NqVwGEmZmr+WLHbu5evop1KxYU9d8OxTq8Da2w7/YTnO33Ukgaj2AX1Z3AzaTXe4i2GY248Iq7uXHiQk4+9ZKodvdeAy8nLe0+7Nu2+0aeS7ze202OaEPe/i1McQSFy71AOo9iTTj3OtcMJQ+7oPeJce1MrHN8AFY7+gG7hI4G4vd3t/u3Y5s+fYr1iXyOFQJ/xGoOF2CjqrZhM9s9FPt0oo31fqBPqR0Fo5GssHgBm/UBdoa6YOPFtmHjySYmPQJFqSaEv/mPwr5FjiLXv4xPt6bx9oaN5br+tkORUT2hdGPbwZqdvhRacsR1vs8CMnfv4w1g7SdPkJe7v+h41+E9FatVXAQRUVIBvDwACB1P7s9Nkz7jqptfLDXj+KQeF9OhU3t8vgHYZWk6cBmxFvu0tPvIyOgQ5k9xP7bPhqFB+qtYE05vbOG9b53v3th85cVRrn0Ym2F+FPAU1gjzFDaB7khsQ6epUcdkl9fLsQ7147A5Fq4ZzYM1a32E9U88jhUkzzrn/wL7uzsoYqyDgOPx+RbRa9DomPMXi6SEhTHmcWPMn5x/r8DqPddhDWI9jTGzkx2AiLwrIkZEpkZsby4iz4rILhE5JCIfiki3WNdRlPJS2W/+7RrFf5Ns17hxua5f1YSWHHFDetdgC1i4CYNNfij+M3cd3oew5QLvirjeveSRzhy69j4rqQ544vFw8VWP0KV3F0R2Y0NPe2M1lF6ELqBpaQM5utNR+BBW1GtM34Yt6NuwBaem1ec0rKj5EmjRtBnHdT4Lr/cwdoE/H9ckBn6sBT5ycT4BkTZYI0zxy4d1sh+FddxnEy6AXsWKyoPA/7BRVjuxdaDimaz2YbWYOVihdgn2Hf63zlM8BAzA51vPMSd0KFOv8aTzLEIxxmwFSu/QHgMR+QW2bGTkdgHmYYuy34xNTbwTWCAiPZ37KkqFUtlv/td278CkhZPJ9Zd0TjbwTWZM9w7lun5VEllyJGgM8wMBhmCrEwFMDAbpO2sOv3nkYYKHAkUO7zMKDtPXnxe9e54IuU3alCoocg/t47WnL8cvzcjZu5FgMI2APxfbr+1cbPCmW3xwKvADIl7OvWRCCb+H2/r1C6AVcDHw5f6fuPr61/lh/cesWPQKOfv2EgzspLDgeILBZli/xBkh9yjElyb4C+8mXsa3FQQDsNrBJuxSdxTWSOOG0t4EtMb291hGeMe732Cd682cGfu98/NB4EqaNHuHho3aOImJy+k16Noy9xovVViIiBv5lAjGGJOQABKRZlj9cBw2OyWUEcBAYLAxZoFz/OfY36U/YWdEUSqUdo0aszs3thMz1pt/ok7xYZ2O472N2/hsax9y/ZNwnZMNfJMZ0L6QCzseVxmPlRT+YJBpi5fwxje7yA8UUM+bzsgTW3HXwH744iwwoSVHzg8E+Ibi6qKRCYOz/9yRu17dBE4jpPf/81u+
+uELegcK8BfmEwwG8Xg8NPAY0j0eunu+IS93P+/Oup6hVz1ZwklugkFeffJy9u/6lgAdCHA09t0ytLZSV+xbvQe4FcjDmF6IR0osnKGJg5OBpcBJ/gKyPn2WvkNuo3PPkUU9sIPBRdjF+WWsGWg7UEiPfkNY9+WHwDH4ODckG9ylOOPbCoqPsFb++5w7hobS5jhjvgnruI7seLcF2x2iCfAANifkMdLSPmXQ0JsqrL+HmFKcdk5IbMKePWPM5IRuLPI0cLwx5hwRMcA0Y8x4Z99zwFBjTEbEOTOAs4wxR8e7dtc2rczsURcnOmRFAeDNb79j0sI95PqXUfLNvw9TzmxRIlopPBy2+G3QFQCR4bBBY3h7w0aez9rMtoMHade4MWO6d+DCjuWPtiov/mCQs2e+xa7cNthlsrieUasGO1lw9bCoAiMnP58LZv6XpX4/HbFGkD7YZXkJxcICZ193YMgVt9Kx5x9ijuWuAV+RPf0JADLG3cCl93/K5289wGmDb6LvkNvCjs1a+gKLXh/P6wS5BA95ZGPbiLrkYU1E4wjPd5hD24zpXHVz8buqq1WsLMyllTP2WdiiGwVpDbjmL0up3+AIp7vfOKwAisRed/eOjVDYHw+vE2C8kw3u8h983IKf97EmreZYJ/zfo1wziPVZtMdWuo0Mme2NdZyPwAqsAXi9+zn2xB5Jme8AHrrj6BXGmN7R9iVSSPDuhO+UICIyEFtHt4QJyqELEK3W8hrgGhFprMUMlYqmLG/+yYbDekQY3un4pEJkU8W0xZ87gqJkctmu3F5MW7yESWecXuK8aCVHjsEGeUYzLZ0FfPjqY7Q/cUwJLQFgfJOn2Dp9JSJgDHz9t0fJ/Ohz3sDwy0+e4K8N19A0PY3n+zxNXu5+Fs+bzAi89CNISww/8TBB/hZyxWj5DuDWkQolNHFwMjabYSg2zmipP5/MRc/Qd8htznnx61OJ5OPljYhs8GZAHj7uwMNevPyVAB2w6WkDY1zTg12q7yZ2mY+xWD/FadSrn8Pgi/9YVP22okh5s1oRScOWYXzQGPNNjMNaYI13kbgl0JtHue5YEVkuIsv35kZGFyhK6XhE+Pv5pzPlzBZ0aX07LRsMpEvr25lyZouYCXO1KRx2zte7iJ0kNoU5X+8ocU5kjw+X9tj4oB7Op2fI9zogI1BI06XXRx3HlrOvK/q3r00GL6z/gYvy862THHj5x+LChMvn/wMTyGcahTwK7MI4Ibf7Iq7ajZK92VaVCCHN2f4NK+o15rQGzfgnxaG8E4EtJsiebPsOa8+LHazQqElbfP5DjMA4eSOFeLkL64Duj48dzAXSeA2b/X08tpjiaKy4HUx4wyI/8R3cftpmPMqFV/yOGycu4uRTLq3wXuTlcnCXkT9jewJOi3OMEN30FVNPN8Y8je1IQtc2rcoXEK/UWZJ9869N4bCFwULiPYvdH06skiOPAjd6PCwyhi8drWMD0A8bXbQL6PvBEubdvZ/XPw7XLtrv/gr3fT8nP5+XNmxmqdPQyK17dcWRLbl08a/455KFXA5FeRpvAJdTSCH3RmgXq7AOYZc8YAK9Bl1HKEMcP8qS9x7kmEVP09GpY9UKaInQtI31Z/UaeDkfvHYfhYUlgxV8vnvI378JE8wvWuTupYA3eYIAC/HRlhF4GUqAixBeYxMBrsKG00arW/UMVieL7U9r2rx9mDmtMkipZiEiHbARchOAeiLSzHF0E/KzF6tBtIhyCVejqJoSn4oSQe0Kh3UXpGiscvaHU1RypH69Ep8vRDgywjw1HGuVd53d/3tyKmOWjWXMsrE0bN2Shq1bkj39CcTrxds6I2ZV3Zd/3M2/vv4O/H4mQlGexlCsMcZqF5vwcS42hHQSdhF2w1r7A9l88Np9zHjkcl6Yfh65h/YBsH/vFr78+HHuDil4+CiwE0PWp/8iL3d/US5HWtpAIkNxj2h6kILD+7icyLyRNLxsx8d8plEAwL0ESONB4GtsSG1oiO0i
bJ+5rlhxNYFoORkidzPgvLEx/t8qjlId3BV6M5GzgAWlHHYKNtppiDGmfcT5zwNnq4NbqS6UxSleXTn936+yN78D4T4LcJ2ozetv5rMxlyd0rVCndytsSpxgY3/Oxwaa7gL6er3878IzaZKWRk5BIX9e+hV/7duTFu2PKeE4d9kA9HQcGmdhTRSDsTFErgZjq0r1wEMWAdoRoCNW2G3BGsk2YBfkmxz/wWsc0epErr71XV775+V03LTcmimwsUiuUehyoOugsQwYdheBggJemzGGzd+tw4bmptH+mE7s3rQcrwnwJdGc+8JQDK+FbB+Nh9e4lACvRpnJOVjTYBrFgsItm74KmMDxJ5/IxUk6smMRz8Gdap/FV8DZUT5ggw7Oxs7pPCBDRIpqpIlIU+zLw7wUjldR4jKs03Gc3r6ABr4+hL5hNvD1SWk4bGj2dFm5vX9PSiaJuVnK2fypf8+ErxVqnpqM9V8swv6R98Eud16gnzG8lL0bX5sMXvpxN1/s3FPkk4g0cYVmVfdzGlGtAc4BhhD+Fn8h4COTuRjS2Ab8Gusq/Ss2t+EYbOR+a3y8x1zg4K71rPzs3+zavJK1WB9Ln4bNOMNXn/ModnRvWvcBQb+fp/92Ppu/24+1fq8AnuanH9YjJsCZEDMbPDKb5l6CpPE2Jf0sUBxiexI26e4urKP+QmxOxw8VJihKI6U+C2PMPmw/7zBsDh6bjDEfOz/PwxZBmSUit1OclCcQZohUlCrFdYrbcNjbqywcNjR7uqxlyS8+sSMfbNzGwi0/ETA3Yt9k6+OVQs48qg0jTuhY2iWKcM1Tg4xhd34BDbBFwi/FLjpLPB5mp6cBtjputD7i0arqGgP78/Npgl0MPsbmZN8Tcf9pWGt/P+AifLzG3QTwYf0W52EXXg8+HiwqST4cH2+/P52R3jRm+fO50lef73uNZs2SmUwNuW6vnO188PqdHD7YkPBQ1takkUN7rBAb7Gw9AATS7DFSmEdDoiQfUsjHRc2WQlmFtci7PTGupDiiaw6+tJtTIiggxWaomIOIyLNwtrUAHgRGYv83PgduNcZklnY9NUMpFUllVqKtCFxzzQt+P7/0+Xjn6p+XuelRReeBPL5sBZ9+mUUPY/g3cKUITbp3Yd2u3Tw4ZHDROB9ftoKczNXMCAS4xuulWY+uUYXe48tWsOfLLF4whmuAjdhg1Mei3PsmoC92ae1GI/LY6hz9LdZncR/1uZVVHCoyX3XHakGnOj/39Pi42OPlhRD/xZW++swO+igMPk9xTsQ+fHQljWyyKGl+6pXekIxj+7L9+2UUFuZhTBC7+Lvf4OdC8sIMJ3lYPWwzNs/iaIo7+xUAvenR71TOHRlWKalclCvPIhUYY0r8Fhpj9mB1x1+nfkSKYqnoHhSVQWj2dHlbqlZkHkhOfj6zMlfjMaaoyc0UYzh11Rryg3acV/Xoyi3vfsjXO3exLKSPeM8vs7j4pBPIaNok7HovZq1hqfOCOxEbiut1vkNJw+YzH0FoqXH3zd1GRvm4ixGOoMA5biS2ItOpWJeyBP1MDoZXop3iz+N1oJBjsELiMoKchI9sBhA9t6RfoJAlP64jp7AJmObYkh6nYZuO7sFWtc3Gmv1Cy47vwUZD9aS4+94zwE4aNclj8PC7S/lfqDiqhbBQlOpKvKS7xVtOY+riz8nanl9lGkdka9WKbqlaHmZmrubYYJAehPsTLgga6gEvZK0hz+9n5U/bGSUS7nMwhgkffcK/Rg4Pu15kZNSlEVpITn4+Z//7RVZgwt7ubSOjhwkU9fi+AR+3lIjfn4LVOX6PjYAK9T24WLMRfMz9CCfi4SM8zMcLRb6OIyh2CB8A/MEAhQcOAf8mPDS2G9Yk9iV2Of6tc3ZTbNXa9djO1VDcRa8XHTq2YNSYGXh8qVvCU56Upyg1idhJd+nkBVry8hpvpfS+SJRYoaWzMqMVQEgdrlaxyZgS1WSnYfsvnxkI8NKqtTQAJkfM173AV9t2
kJ1zoOh60ZL/JgYCvJC1psixPzNzNQMwMZzLBXg5FTgBHz8wkOglyV0H/Ers+30v4NS0+pyaVp/uzr+X++rh4118TGeuc79jsbrBBcBe8ZLbsAW5DZsT9NWj0DQkjx2UDI39Dhsf1hDryG6Ktb63xLpnXUHhYhMk83NNSgUFqLBQlLjETrp7CTiIiSg/XVG9LxIh0QW0KpiZuZp2gQD9iR4V1BfYZgzHGcPAGMcMAiZ89EnR9WJGRjn9xsE61peJh+40oRst6IyXbkB3PCwF0jgSmImPtSyhCd1pTnc8dMdXlGG+GJjh8fBlehomPQ1fwxYceVx/AgivAkHxcOWfl+BJ8zGCXIZifwM2YeOZpgEeXzqX/3EBTY85g/2B5uTxPLHbpy7D6/Xj9XZxtt/vXC1+OZFUo2YoRYlDeCXaIFZIPId953yK2GU+bq/0/IpY2dOhC2hZfRfl5Yc9e9nh8fBjMFjkTzDY+qxNnH/vwxb1XoU19+xzjmuGfYYAsH/7TnLy8xPuN/7A0HOLcl8K/Rfi4W/kcTQBHiK0QJ/1OoD1ETwOPEXLBgNZfO1FYdf177CL8nV7T+CU7z6zfqFgkC8XPomP/KJ+3ZOAudjA3PuACwrzmD/nNjZv2AumIfFLdWyhVbsTyT28n5y9E7AGsH3YeKpOFDu13Xf7kmVKUoEKC0WJQ3EPiguwbTE3YCvW3EZVl/lIdAGtCiL7g7sRW0ucBLvJ2Dqp/w455koRvgFWhJikrvF4mJW5Oql+48M6Hcf/1mfzxZa/8RqGS/iJAHdj7f2RyYb3Y6vRRs+297XJYM/WH1j9yeN4g0H2YR3cPT/9V4ny6yOxYufPwDQM3dd+SCHPA88Tv393Ib0G/ZwFbz6MTf3zY3M3Ist+zAQKSEu7j16Drkl4PioKFRaKEge3Eu3CzSdSGGyNbXNZH7ssJN/7oiJJZgFNhsoIFQ7VgjKxJT+WRhwzxRj6Yd+pmznbyuKw94jQrVVDjsoWhgYNIyjgf57vyDe9CJjQ7Of7sVFII2ng6xe1+VROfj5XfbKMDsEgS51x/x4bJRWZ2zERW5vqL8DtwGkYPucvFDLRuVc0YTWJthntOan7CBa/9w9yD23F/o7l4eMypw/GMKxL/VbS0hZzdKejytTprryoz0JR4uAm3bVtFMRmz7p/7L/BLgAla/XU5K53bqjwpIV7K9RxH1pDarDXSy9i+zKmENsnkQg5+fm8tGotdzvFB6cB9T2FTBjYjCbp1zt3mQpcgSsoYmXbP7viK7IPHOR7rCB4DPu/fkaM8Q+iON5pOSBk42UL0Xti96JtRmMuHfMsc5+7ChP04P6O+XgQDx/h5WHs79xf8Hhf4rxR1yTdo6KiUM1CUUrBI0KuP7Ii6y+At7ALwB1Ux653ZSHZ/hyJEqoF3f7uh2Ru2x5mPjtUUIjfWdxDs7tdkjGpxYoQ233oEEt+NTwk6fCvRUmHA4/K4Lr/vRuWKJiTn8/Lq9cxF1s4vB820ulNirt5Nyl5e9oAW7mAdN5hLnAJ0wmwGXiX4vaph+jRbwjnjJjC0g8eZut3nyFpTbG/Y/vw8UhEH4xu1G/QsMK63pUFFRaKkgAlW656sJWOXgam4pXvOalVi2rT9a6s2FDhB6hMx32o4MjJz+e29z8KW6TLQ2TeiUuoOSta0uHjy1aUKJcyM3M1w4O2FMhIrAlqIjaPwuf1YQJ+p8FOC4qNNIeAXNJ5nxHg9LI4zGv8jgDTgHqkpd3H0Z26c86IKeTnHyBr8bO8gWGU/wDwOT6+LSpBUpxM2KNKnNqhqBlKURLg2u4daOCbTLjZyQNcSgNfgPsH92T2qLMZ3un4GisooPT+HF/v2sPAGW9y2ewFvPntdwmZpeIVOQytaVURJBpiGzk+ty6VG3LsbnMrNU3EmqBaAaM8Hs7/5Z/wNM6gAMilHgd5ioO8QyEB53UiENLLwlBPXqNBo3NpmzE9zJTk9vseCowQL2lyOz6mM41c59xc0ngY
n+8eeg0aXSFzVFZUWChKAlSX6rKVTWn9OQLm2KT9GLEEQnbOAZ5bmcksqLC8kB/27CUzLY0e2HDcwc53DyArLY3vo5izopVLiWrKwtEugkHmv/BXgnlut7vdePkNPs7lBApYi9VEQs+92JtOr74juermF+nc07Y7zcvdT9biZ4v6ZtwTLMRndjKC3Ig+GIdp3iyvSpzaoaiwUJQEKEvL1ZpIdA0KisNMbyeZBMRob+0u4z/6hAC2REZFZZ0/MPRcRnY5iUu8XjZjtYrNwEivl0u6nBQ1pDc0sdFNaJyVubpEsuN4rLA4BJwWDNDJX+iYigrwsB8vB9mCbQMaGSk1xZ9H1uLnyMvdX7TN1SpcwdAK8GKKcjdc7sWQn7OF/PwD5ZiZ8qPCQlESxC2yN3vU2Sy+9qIKMTsFjeHNb7/jstkLkjbvVAaxNChbMekEbASRS+l9xqO9tYPVKjK32Tfzx4BbypB1Hs28lWxWezQN4vxAgHaB6KVATgHO9nr5zIDrQbgXqAecQIDjIGZGev+An8xFtm1rpFYBJWtRxTq3qlAHt6JUEdWxom20/hz78vwEzHjgFkq+X8ZOQIxX5HD8R59wCdYBPAxb6TXZirnRengkk9Ueyxk+xRh6AAPS0zhQ6OeI9DQE4WCBbYXarlFD+hw6zAznvFbYjPRsbDZEFjZSyu9sB/Cl1SctrSHttn0DWK1iQDAQNs7QWlTGV5+09PC6UO65VYUKC0WpIiorTLW8RJYpv2z2AtbsOprohojYCYixQlif+GIlmdt2MNPZPh6rt7wXCHBeggl40ZolNa1Xr0RWuzFwqLCARmnpiISH4MYTLGd5POxp0Zwvt+3g0i6dMcAzKzOtUDh4iAlOmC9YjeAobAHCGdjs9Puweth3WL3s5+Jh9O0fU7/BEXb827/h23qN6UOQgryDeDw+vP48mjizXL/ZkVxxW2kdqFOLCgtFqSJSEaZaERSXPCmZgRwrATFeCGuP1etKOIAvwAYiJ1rTKlYPj0ifxOPLVvDMykyu6nJSiWvGK5diDORs31kkjPzBIA2wfZ23RQiYpdjMiTex2eePYRP4RgPDoaieVOaiZ+g75DYAhlxjTUpL3nuQ5Qv+UZTZ7jZh6pWzjbzc/UXCpTqgwkJRKonSymaUFqaaivpSieCWPPlsax9y/ZNIJAEx1lv7IWw71MjmoW7+QvM0X6kJeIn28IilfbjEK5fidu4bGghwfiDAQmM4D8jHmov6eL3kBwI0w3adcP0Uk7FmtX5YE5QbPjvFn0evxc/RY9BviwSA67d4A8PooJ9WzrEdKSlcqgPq4FaUSiBoDDe/+yl/WfAda3YF2J1rWLNrL3/6aDWjZ7+LPxgsNUw1FfWlEqEskWCh5T36eL2chg1jHQScTnQn7lkeD5d26cyEswfFzMuAxHt4xHKul0akk3yKMewBbgZeBJYA+SZIfxE+wfopvsGWAPkn4NaNvZRw7WloYS4rFvyj6D5fLXyGi0x40p9LtOipqqZa9OCuaLQHt1LVzP1mA3cu+Nppi9MOW/PHrSI6kS6tDnB1t2OZvGgvuf5llDTv9GHKmS2qhRmqPLjVZpc61WZHAB8DR9SrRzTffY92bTmmRXOeWZnJ2FN7cEOfXmFZ3kDY9Vw2AH1D+o9H3jdyfzxC+4G7XInVKJZgixxeAXzg9dLA52V/QSH1vB4K/AGGAU9ihcMSSvbi7un1MWb8SgD+c29fVhbmFo2vHzAf2/4I4HpvOofO/F1KtYtq34NbUWobj32xDkNDoDVwA7Ym0GagA3Ara3ZN4LHlGwgEDyMcjeEK4HpgTY2sLxWrbEekFjAPuMbrpVkUH4J7nQtm/jfMdBQa9WQgoWineNpHPH9IzAgprKh3+1VMBT4Q4cITO/GfrDUQDJKGza94FMKaPoWOsX/Az4oF/8DrrReWY9EROAc4w1efeiFRUFUdARWKCgtFqQS2HfJj30EbYd81/4xdbr4C/gC0JPvA3bjahjAR
n2cGnVo051c9jq5x9aWihbEmUqcp8i0/0nT03IpMZq/9ukh4nPazdqwtpYdHWe4bev+YEVJYkX8F9u2/p9/Py6vX8QYwKmg41TluJVZ/PMs59wDWf9HM+Xf66vc4eGAHz/vDzWzTgHc94VFT1QkVFopSKQSxkfc+bKNO18y0DGgPLCQ0XNYwDJ+nD7/qUfNMT7Ecycl28ovmuD511Rou8XiKhEez1i15ZNiQuON5fNmKMncQDI2QOlRQSGEwiMEm3TUB2mJ9Lj7s//Awp9DgcGw01EAgBxCKBb3BcDZWq9oA9NyXzUCRmJpHdXNsu6iwUJRKoF3jRmQfCAB3Eu6PeA6rZVTvcNlkiBXGmmgnP9eEdVLLlmGmo1aAMSasFEcijZDK00GwtA5/ru/jldGXcMnLc5ga0jPjHWANcIOvPlvO+C19h9zGkvce5JhFTzPL0SI6Aj8zQZZ60+lbL1qB8+plegpFhYWiVAK/P60jf/4ok5KhsZujbHOpPuGyiRIvjLW0Tn5uuY72TZqwNPsnVv64ja9CAm6iRRQl4neIVQId4Lb3P2LC2YPijinU9xLL93Hnhx9zYTDc5zASG9HkhsmedNrPyVr8bAlz0zsmSC+pvuamWGjorKJUAsM7HU+T9HRKhsZ2iLLNpfqEyyZKomGssc5dmv0Tc7/+lrmAGMMhrGnGbb06IeKcWDWe4t3D9aUkUg499JhYdaZuCQRYvWMnUyPHRnEZ82HBIB++PC6spEd1q/WULCosFKUS8IgwfmBn0jyTCK/gWvntWOP1j6hIki3aF+3cS7BZ0UOBC4GzvZ5SW68m2mY1suLtrMzVUavfxjr+2RVf0T8Y5BDWWZ3pjOFZYhcL7EuxdrFz80q+SG9I34YtSnxW1mvM/mpqboqFmqEUpZIY3ul43t+4PSLzOR2PbAF6ETRTqKh2rKHmk2iRSeUlWjZ6+0YB+pXRkTwzczVDgkHeBlY426YBbwWDzP7FZdyzYFGJ1quhJNJmNdSXcn4gwDdQwq8S6/jzgkFeXfM1jdJ8DPYHOBgIMNjrpaHPx978fOpjkwxdDmDfvBs4z381MNDj5VD/q6uls7osqLBQlEoiWgXXdo0bc023EwH4z6ribeVtx+oKiOdWZjJ7zdcxS1yUhVjVcX/Y/WvSRTizfvTrx1rQ3Tf4y4NBLiDSJwFPfrGyVH9HaUT6UqYYQz9s7aZojvLI49sGgxwGLu58Am9+vZ65wC9FGNzxOF5bsy5qwl0PID3Nx5ci9EtvigmaauusLgsqLBSlEoms4BrKiBM6RjkjeULNJ6NXrWUY8d+gkyVWddxDZhhBbx/+MiC5cF9Xq3iJYq3CZRrQc806rj/t1KSFXKR2FelLGU5xH+3IuQk9fh/wAjAXuGz1OkZ5vUXzOXvNOs4mesLdWR4Px3frwthj2rH6lnks/2xbUuOv7qjPQlFqOKHmkwuCQdo64ZzJOoNjYavjTiR2uG/s5keRuILtiGCQvsRo9GPg0SVfFPldEvXBFGlXKzKj+lLGYx3Q+wifm0jfy6NYP0o/bNHD0NBdsOEJg53PadiigmfWr8eq9OK2rV0fGUHv09slPC81AdUsFKUGE2k+mYYtNXE3iYealkZxddx9+LgMP7Ox+ciQbLivm6j3KfAjxVnOoRwA5NsNHAoEi0p8xPLBuNrEpDMHhmhXaxgUI+mtD7Z0x80U+1VCS4i4UVhLiR66O8r5/j3wC+x8nyfC7F9cFqYJ+XdkJzwnNQXVLBSlBhM1dJXiCqYVoV241XF9PIiHj/DycMje5MJ9f9izl6/SfGwACoC9EZ8DQCFQEAiWGsWUk5/PFbPnsiz7J8Z/9EmRdvUzY1jm+FJ6ezz0hKLPIuApYFC99CJNILRC7mCvlyHY8NfHiBK6i53bO4EFhPfgiKTrIyO4dkTCU1PtUc1CUWoosWogjceaUC7G1jBKtKFQLK7t3oGJn0yEwA+8juESHibArUD9pMN9Hxh6blFDol85VWUh
vNLrqSJ0BoYaEzeK6dkVmfx04CBvAJdu28Hjzj3eAvo6b/tAWJKde+9Lu3SOqqVcMPO/pRYDPAV4HtvgaBTRe3D42mQQ2JkN48dCn6cTnp/qjGoWilJDiay9FGr3PwUYHMWWXhaGdTqO9o23cTG5DAUuwo+X39LA1yfhcF/X75CdcyAslyHSZ7AP2GQMk51M7inGsMmYqH6Gl1et4XIo6gfxP+deoea3aEl2ofeONZ+hxQDPAnoDp4rQHVguwlHOfS8W4dIunaNGb3lbZwBw14CvEpnmao9qFopSQymtBtLp7dqWOwQV4GBBAbsP5TAVu4DfSz5vM4c/n346ozqfkFC4r7toTwgxF7kLuoEiU9pkbM+L0qKY8vwBjDFMdI6bgtUEfo/1pkwMBDgtczUiFIUR5/kDYXkUV8yey8uXXVzka4g3nw0N5BQU8G/gGmN4272vMXHrVYlA9vQnuGvcDUz7tGep81SdUWGhKDWUihAEiTAzczUXRfhFLvV62HnwYEKCwn2jnwVcs20H/3G2TwwE6JO1GmPgC0ereAzbNCiU8RQLAlcI+IPBEs7nCyh2Xh8CCAS4QKRIML20ag1fOhpL22CQ7AMHeW5lJuP69wESa7O6JhAo0T/8/EAgppnP29oxR9UCUmqGEpHzReQjEdkmIvkislVEXhGRkyOOay4iz4rILhE5JCIfiki3VI5VUZTylfRwcZ3wa4BLoCiX4WbgLH+AdoFAUf/qfkQPp3WjmLxA60AAjInqfP4n1nl9lsdDHhSZsyYGAogxtCI8j+LlVWtLfQZ3Dv4QCPAYJfuHTzGm1LnInv4E45s8Ffc+1Z1U+yxaYPNwbgKGYIMKugBLRORoABERbOn3odjfp1FAGrBARNqneLyKUqeJ5xdJpEZT5EI72dn+KDaaqAO2Du+geun8x+NhETZqqUfIpye2I8gMj4dB9dLZApxJdKFypsfDsJNOIGAMl0PUqrBuHsVQYGgwyHMrM0udg+HG8CbhTu/Q+57maBfR8LbOQLzeuPeoCaTUDGWMeQl4KXSbiCwDvgYuAx7CmiwHAoONMQucYz7H/r/8CauNKoqSAsrTG8INbY1caN1chjeAq7DaRJcY0UmRrVofX7aC/63MZBUlczQOYjPmN36/Kcyf4TIRW+gvCHzhbJsG9Fq1lt+c2iOqzyE04mwisBqbjBfAaigtnOvtMwbZtTvmXNQGqoPPwp3hQud7BPCjKygAjDH7ReRNbDSgCgtFSRHl8Yu4oa0TIGyhzQHOw77Zn4ONYkqPstBGK4j4w569BGLUomoEdGnTmk+3ZHMG0cNeTwV2Eq5xXBgncTFUs5oWse8m4ESs+eMmj4eAeLhs9oKiQovXdu/AsE41qz1uPKpEWIiIF/t/dzS2XvM24GVndxfs71Uka4BrRKSxMaZmdYhRlDqGG9o6GMIW2hyswHB7Qbgd5jo0a1bi/GitWksTXo8vW8HGzVtZjs05b4jVAsTrpb7Px/78/BIayT1xIpoi26z6nVIqLks8Hmanp3GwwE/B5n0cMv/CLbQ4aeFk3t+4nUfP71/KbNUMqkqzWAq4YnwD1uS0w/m5BfBDlHP2ON/NsRpnGCIyFhgLcGTjRhU5VkVRkmRm5mp+ZgxrsfWT3MV6T35+idDYkcCLq9bwm17FpqBYrVrj4QqYD7AO0VexpcLfx5bkuODEjvy4eh2PBoMJl1RPRLN689vvmLRwb4lCi7n+YXy6tQ9vb9jIBS1q/ppUVUl5V2NNlVdiXzY+EJFjnH0CmCjnxNXljDFPG2N6G2N6N28QWfBMUZRU4S7ab2HNQEuAoAiDOx6LAPdEHD8RwBieW5EZdn5oAb9EIq9ck9FMKKq8ewHFJTk+27SlqKxH5Kc8iYuJFlrcOn9lma5fXagSzcIYs87551IReQerSdwBXIfVIFpEOa258132VFRFUSqdWK1WP9iwkdOJ7kvoByzatJlxp/eJ26o1nnbh1p36KL+gqHHtRGxEVfM0
H6c0b1YpuSk/HTyAXcIGY2O7OmA7Iv4Ct9Cit2U7/DuyGbNsLM/X0PIfVe7gNsbsE5ENFGuma7BaZCQnA5vVX6Eo1ZdY9aomBgLMDQZZV68eZ8awEfRo3izu+fEypaG47lRO5mo6Oud3BEZ5vTSLEm1VEQSNoTBgsJkbd2E9JVlYV+zbwMiiQou+Nhk1uhptlQsLEWkLnISdbbA5Fr8SkTONMZ84xzTFhka/WDWjVBQlESLzMly8wEARju9yUtxF+/FlK2KeX1pBxPIImrLy1vqN5Ad+BnxGqL/CGsIGIYzjmm7hjaF6n96uRjZGSqmwEJHXKa7RlQOcAIwD/NgcC7DC4nNglojcjjU73Yn1WfwtleNVFCU5ypOXUd7z4wmq8lbejcWMrM0UBB8gmr8C7sDYmJsiRGzp8rXD5nB4Z83Ky0i1ZrEEGA38EUgHtgAfA/cZY34AMMYERWQ48CDwBHbWPwfONsZsSfF4FUVJgmR8AtGS7srjUyivoEqGoDG8tX4jX+/eB9wGPE6xn8KNG+oGNOA/q7YUtdD1tq65pqhUZ3D/FfhrAsftAX7tfBRFqYU8uyKTpU4b1HGn9yn39VJVWDFoDDe/9xmLN6cRMM9Q0k8xEyswVgEd2HZwQ4lrjH5rFK/UMO1C+1koipJy3KQ9weZYlLdPeCr53/rv+HiTj4LgF9jSdZ2c70XAt9j84jys8OhTopOgr01GagdcQaiwUBQl5Ty7IhNjDG8AEpJjURP4+xcbCJrJxPJTwAPAIOB46nvnJ9VJsDqjwkJRlJTiahWXUtzlriZpF9sOHsKanqLRDdgIDKC+dw0Dj/In1EmwJqDCQlGUlOJqFW4/ionUNO3CA0Vpf5GsAjx0aT2Pe85qyaPn9681hQRVWCiKkjJCtYpo9aFqgnbRrpEPWwoxL2JPHjCRNI+fa7t14MKOtafiLKiwUBQlhTy7IpNgjC531BDt4ubTOuORn7B+iTlYp/YcbMeOjhQG/8WkhXv4w3ufEzTRytzVTKo8g1tRlLrD4k2bE6oPVZ256ITjeX/jNhZt2U5h8M/YcnbdgduBKwAPuf6L+HRrH6Yu+pysHfklelwAnNwpjeU7q+45kkWFhaIoKeP45s3IzM+PmTjXo3mzVA6nTHhEeGzoAN7esJEJn6whz/8MNnQ2FFtx9uW1N2B4gsgeFw+f2oGuj4yAW+bVmNIfKiwURUkZqUqcq2w8IgzvdDz3f7aWPH/syCiDD5vdbavR5vrHsXjLw7x3fD5Dj6hZrRTUZ6EoilJG2jVqTOzIqEzgELa8XcD5voe8QAv+nbkJgC6taoZWASosFEVRysy13TvQwDeZkpFRh7Fduo8DxmP79Y0HWgLr2bRvFyIg48dy7YiUDrnMqLBQFEUpI8M6Hcfp7Qto4OtDeGRUJ6ANtnR5ZEmQthQEbVHBmhRZq8JCURSljHhE+Pv5pzPlzBZ0aX07LRsMpEvr2xEOAXcTvSTIeAImrWjLUQueSt2Ay4E6uBVFUcqB6+we3qm4yVHXf84hYGI7voUgYLWLrfNXcu1UmDEvBYMtB6pZKIqiVDDtGjciXkmQnzVuVPRTTTFFqbBQFEWpYH5/Wkc8MpFoJUE8MpGbT+sY7bRqjQoLRVGUCmZ4p+M56+gA6Z7ehDq+0z29OfvoYJjJqqagPgtFUZQKxiPCY+efztsbNvJ81u1sO3iQdo0bM6Z7zS0wKKYWFbpyEZGdwKYU3KoVsCsF96kJ6FwUo3Nh0XkopqbMxdHGmNbRdtRKYZEqRGS5MaZ3VY+jOqBzUYzOhUXnoZjaMBfqs1AURVFKRYWFoiiKUioqLMrH01U9gGqEzkUxOhcWnYdiavxcqM9CURRFKRXVLBRFUZRSUWGhKIqilIoKiziIyK0i8qaI/CQiRkTujnNscxF5REQ2i0i+iGwVkeejHDdSRL4UkTwR2SQi40XEW5nPUREkMxch55wuIkHn+BIJoLV5LkTkZyJyn4gsF5H9IrJTROaL
yBkxrllr5yLk2N+KyNfO38c3InJdjONq5FxEQ0RaisijIrJRRHJF5HsR+YeIlMhlSHR+qgoVFvH5LbYo/RvxDhKR5sBi4Fxsh5PzgNuAAxHHnY/N/f8CuAB41Dn+3goed2WQ0Fy4iEga8E9ge4z9tX0uegE/B+YClwFjsIWCPhaR4aEH1oG5QER+i/19mAMMBV4FnhCR6yOOq8lzEYaICDAPuBJ4APs8DwC/AOY5+91jE5qfKsUYo58YH8DjfPsAA9wd47insBnjTUu53pfAJxHbJgIFQLuqft6KmIuQ4/8CrAamOcf76tJcAM2iPLMP+AZYWMfmwgfsAGZEbP8XNqs5rTbMRZTnPsGZk7ER269ztp+Y7PxU5Uc1izgYY4KlHSMijYBrgGeNMTlxjjsK6AnMitg1E0jDvnVUWxKZCxcROR64C7gBKIyyv9bPhTFmnzHGH7HND3wFZLjb6sJcAP2B1kR/xpbAQKj5cxGFdOc7cl3Y53y7629C81PVqLAoP72ABsB2EZnt2CUPisgbInJsyHFdnO/VoScbY77HNuw9OTXDTQlPArONMQtj7K9Lc1GEiKRjF4Z1IZvrwlxEfUZgjfN9crzjavBcrAEWAhNEpLeINBaRPlhN6R1jjPt7kOj8VCkqLMrPkc73g0AAGAGMBU7B2qebOPtbON97o1xjb8j+Go2IXAX0Bm6Pc1idmIso3A20B/4asq0uzEWsZ9wTsb9WzYWxtqQLsabHL7A+zKXARmxDbpdE56dKqTPCQkTOdaI1Svt8nOSl3Tn8HrjCGPOBMeZFYDTQAbjKHYLzHS0LMqX1iitrLkSkBfAQ8BdjzI54hzrftXYuotznSuAO4B5jzKLQXc53bZ6LeM+Y6HFVXtO7jPPzDNAP66c40/nuDcwWEXftSHR+qpS61M/iM6BzAscdTvK6u53vD503CQCMMUtFJAerYUD8t4RmIftTQWXNxVRs9NMrItLM2eZ2rD9CRPKMMYeoG3NRhIhcBDwPPGeMmRSxuy7MRegz/hSyvUXE/uo0F9FIan5EZBg28ulcY8x8Z99CEdkIvA9chI2WS3R+qpQ6IyyMMYeBryvh0q5dMdZbQTDiuC7A5+5OETkGaAisrYSxRaUS5+JkoBvFAjSUXdg/jJHUjbkAQETOwYZBvg78LsohdWEuQp8xdDF0bfFroxxXpXMRjTLMTzfn+4uI7cuc787Yv4lE56dKqTNmqMrCGLMVWA4MiYib7g80xflFMcZsBjKBX0Zc4ipsxNA7KRlw5XILcHbEZ4azz81BqStz4f4OzAXmA1dFixyqI3PxOfZlIdoz7gE+hVo5F9uc7z4R2/s639nOd0LzU+VUdexudf5gbYuXYf0PBnjF+fkyoGHIcecAfmxCzQXYUNot2KiXBiHHXYjVNP4JnAWMwyZqPVDVz1pRcxHlvLuJnmdRq+cCOAn7h/6D83z9Qj91aS6c465znnGq84xTnJ9vrC1zEWVummIFwo/A9diXp+uxQmQz0DjZ+anS56nqAVTnD9bObGJ8jok49gKsFpGHNcP8B2gb5ZqXYt+e8p1fmImAt6qftSLnIuK8qMKits8FNmM71jGmLs1FyLG/A751nnE9cEOMa9bIuYjxLEcBz2EDYPKc72eAjCjHJjQ/VfXREuWKoihKqajPQlEURSkVFRaKoihKqaiwUBRFUUpFhYWiKIpSKiosFEVRlFJRYaEoiqKUigoLpVYjIneLSJXHh4vIx6EF5kSkpzO2Cq8oKgm2vVWUZKgztaEUpYq5IeLnnsAkbMObalEoTlHiocJCUVKAMaZaFINTlLKiZiilTiEiTUXkHyLyo4jki8g3IjIuogjkWY4pZ4Rz7C4R2Skis0JKr7vHthaRl0QkR0T2isi/nfOMiJwVclyRGUpExgD/dnatD+mDcIzzMc4xofc5K8o1vSIyVUR+EpHDzj26EAUR6SEi85wx5orIpyIyqOwzqdQ1VFgodQan2cxbwK+wTZouAt4F
HgamRTnlUWyNoyuxhd1GOdtCeQ1bF+xO4ApsddTHShnKW9iCcQCXY1ut9ie8PHUi3A38BXgBW/r9fWBe5EEiciq2F0ML4LfOc+wGPhSRXkneU6mjqBlKqUtcCAwEfmWMed7Z9r6INAL+KCIPG2N2hRy/0Bhzc8hxJwL/JyJjjDFGRIY41/u5MeYV57j3RGQetktiVIwxO0XkO+fHr4wxG9x9IQpOXESkObYi69PGmNtCxhgA7o84/AFsQb7BxpgC5/z3sD2fJ2AFjaLERTULpS5xBrbs80sR22cB6di3+1Deivh5FVAPaOv83A/bd/31iONml3ukpdMNaIQtCR7Ky6E/iEgDbDvPV4GgiPhExIdt5fkhdk4UpVRUs1DqEi2APcaY/Ijt20L2hxIZpeSe57aK/Rmw1xhTGHHc9nKNMjF+FuNekT+3ALxYDWJCtAuJiMdEacykKKGosFDqEnuAFiKS7ppjHNo539HawcbjJ6C5iKRFCIy2sU5IgDznOz1ie8so93bvtSZke+S992G1qcexPVZKoIJCSQQ1Qyl1iU+wv/OXR2z/JVAALEnyekuwb+2XRGyPvH40XC2lQcT27c6+rhHbh0X8nAUcwnaoC+WK0B+MMYeARUAPYKUxZnnkJ4GxKopqFkqd4h1gMfCUiLTGvpFfCPwfcF+Ec7tUjDHvi8hi4GkRaQVswLYT7eEcEu+N3c27uFFEZmCjqLKMMQUi8l/gNyLyLfANVlCcFXHvfSIyHbhLRA5gI6FOA34T5V63AguxzvfnsFpJK+BUbAe6O5J5bqVuopqFUmdwzC3DgBnAn7EO7GHYxfSuMl72Umz47V+xzub6FPsG9scZSyY29PUirAD7AjjS2f0HbEju3cB/nWveXOIidv+9wNXYkNkhzvUi77USK0h2A3/HCpZHsU7yhQk9pVLn0baqilLBiMjj2B7cLaI40xWlRqJmKEUpB06m9RFYk1Y6MBS4DnhABYVSm1BhoSjl4xBwC3A8Ngfje2xW9QNVOCZFqXDUDKUoiqKUijq4FUVRlFJRYaEoiqKUigoLRVEUpVRUWCiKoiilosJCURRFKZX/B+JSRGcshbaKAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plot_classifier(X_train, y_train, knn, ax=plt.gca(), ticks=True);\n",
    "plt.ylabel(\"latitude\");\n",
    "plt.xlabel(\"longitude\");"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.74"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "knn.score(X_train, y_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.69"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "knn.score(X_test, y_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- $k$ (`n_neighbors`) is a hyperparameter.\n",
    "- What happens when we vary $k$?\n",
    "  - Smaller $k$: lower training error but higher approximation error (a larger gap between training and test error)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [],
   "source": [
    "knn = KNeighborsClassifier(n_neighbors=1)\n",
    "knn.fit(X_train, y_train);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYsAAAEQCAYAAABBQVgLAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAABf00lEQVR4nO2deXhU1dnAf+/MhB1kBw1udQNZBUQUcAEXFAUUpWqt0vYrVawt+HXRGhAV1FYrWKu1VL9qwaUIKqjVqrghLgjBQAIILmyRQFgDhITMnfP9ce5N7kzubNlmkpzf88wz5N47d849Cec97y5KKQwGg8FgiIUv1QMwGAwGQ/pjhIXBYDAY4mKEhcFgMBjiYoSFwWAwGOJihIXBYDAY4hJI9QBqg3bNm6nM1q1SPQxDPUeVlbG75fGpHkbC+AIB2u3/BsnIqNPvrW/zZIjOjvw1u5RSnbzONUhhkdm6FQvGjUn1MNKaotJSfvP2ezx88XDaNG2a6uGkHdbuApRl8cygOakeSsK06NSB8W+MI9A5s06/N7gzv17NkyE6f77j+M3RzhkzVCNlbk4uy/O3My8nN9VDSUuUZZE5ZVKqh2EwpA1GWKB32RNfe5Oi0tJUD6VOKCot5fnVebwKPLc6r9E8d7LMXNYv1UMwGNIGIyxofLvsuTm5XK4UI4FRSjWa504Ea3cBwZ35dBvRP9VDqVdMWD4x1UMw1DKNXlg0tl2287xTLQuAaZbVKJ47GbqN6M+MAzenehj1hrr2kRhSQ6MXFo1tl+0878n2zyfTOJ47EazCfJQtRA0GQziNWlg0tl125PM6NPTnTobMKZOMVmEweNCohUVj22XPzcnl7FAIP/Cd6+UHBodCDfa5E8HaXYApwGwwRKdB5lkkgrPL/txjl33W6jxu6NurweUfbNqzl9VNMjgvynnfnr11Op60IqRDZU0ElMHgTaMVFpG7bAf3LnvSoAEpGl3t8NDIC1M9hLRFKdjWoV+qh2EwpC2NVliYXbbBwSrM11rF4lSPxGBIXxqtsDC7bIOD0SoMhvg0age3weAk4D1rtAqDISZGWBgaPW/1np7qIRgMaY8RFoZGS3BnPgArPilI8UgMhvTHCAtDo8QRFPNHLUzxSAyG+oERFoZGy/xRCyku3J3qYTQYWnTqkOohGGoRIywMjQ6rMD/VQ2hwiMD4N8YZgdGAMcLC0OhQCnInLzZaRQ3i72QqzzZ0jLAwNCocrcI4tQ2G5DDCwtBocIoFqhmmX7TBkCxGWBgaFSYBz2CoGkZYGBoFTmMj1WdwqodiMNRLjLAwNBpMCXKDoeoYYWFo8FiF+aaxkcFQTYywMDQKjFZhMFQPIywMDRqjVRgMNUOj7WdhaDyoGXNMYyODoZoYzcLQYDFahcFQcxhhYWiwOAl4Jq/CYKg+RlgYGiTWbl3OI29X1xSPxGBoGBhhYWiQKMsic8okUwOqjhn/xrhUD8FQSxhhYWiwzNowItVDaFQEOuvKswPPMdpcQ8QIC0ODI7gzn24j+psS5AZDDWKEhaFB8kizu1I9BIOhQWGEhaFB4fTWNlqFwVCzGGFhaHA8M8j0qzAYahojLAwNBkerMBgMNY8p92FoEFi7C+g2oj8zDtyc6qEYDA2SlGgWInKZiHwkIgdFpEhEVojIcNf5diLylIjsEpFDIvKuiPROxVgN9QNlWakegsHQoKlzYSEivwAWASuBK4FrgJeAFvZ5ARYDI4HbgHFABvC+iHSr6/Ea0h+rMN9oFQZDLVOnZigROQGYDfxWKTXbdeq/rn+PBoYCw5VS79uf+xT4Dvgd8Ku6GKuhfuAUC3yr93Qw2doGQ61R15rFT4EQ8GSMa0YD3zuCAkAptR94DRhTu8Mz1CccQWHKehgMtU9dC4uhwHrgWhH5RkSCIvK1iNzquqYnkOvx2TzgOBFpVRcDNdQP1Iw5pgOewVAH1LWwOAY4BXgIeBC4GHgH+KuI/Nq+pj2w
1+Oze+z3dl43FpGJtqN8xd7DJTU7akPaYe0uML0qDIY6pK5DZ31Aa2CCUupl+9h7ti/jThH5CyCA1zIgsW6slJoDzAHo1bmjWUYaOMqyTK8Kg6EOqWvNwqnB8E7E8beBLsDRaA2ivcdnHY3CS+swNCKc5LuXPu2Q4pEYIhGBXrNHc9PoVI/EUNPUtbDIi3Lc0RpC9jU9Pa45HdiilDpYGwMz1A8cQZE7ebGp/5SG+DtlIjFtAIb6Sl0Li1fs90sijl8CbFNKFaBzLDJF5DznpIi0Aa6wzxkaOfNHLTTRTwZDHVPXPov/AO8DfxeRjsC3wNVoR/dP7GsWA58C80Tkt2iz051o7eNPdTxeQxph+lQYDKmjToWFUkqJyFjgAeAetB9iPfAjpdTz9jUhEbkceBh4AmiGFh4XKKW21uV4DemHydI2GFJDnRcSVEoVAbfar2jX7EEn8P20rsZlSG9MRVmDIbWYEuWGtMcRFKZPhcGQOoywMKQ11m7tyDaCwmBILUZYGNIWqzC/PPnOYDCkFiMsDGlN5pRJJkvbYEgDjLAwpCVORVmDwZAemLaqhrREKbuirNEqDIa0wGgWhrTDKsw35ieDIc0wwsKQVgR3avPTf9XIVA/FYDC4MMLCkHbkTl5saj/VcyRrIi06marADQkjLAxpg8nSbhj4O2UCcM3ZpoZXQ8I4uA1pgVWoBcX8UQspNlpFvUfEu4OZof5iNAtDWuBEP5mKsgZDemKEhSHlOOYnE/3UsJCsidw15MtUD8NQQxhhYUgLTO2nhoXpmNfwMMLCkFKMU9tgqB8YB7chZViFuvOdaWhkMKQ/RrMwpASn9tNbvaeneiiGWiR/1hMMPKdrqodhqAGMsDDUOY6gyJwyySTfNWAcv8XINdNTPRRDDdDghUVRaSkTX3uTotLSVA/F4CJzyiRmLuuX6mEYahufP9UjMNQQDV5YzM3JZXn+dubl5KZ6KAZM6XGDob7SoIVFUWkpz6/O41XgudV5RrtIA8pLjxutotGwbUl2qodgqAEatLCYm5PL5UoxEhillNEuUowTJvvSp6bAXGPB30E7t7NaP5nikRiqS4MVFo5WMdWyAJhmWUa7SCGOoMidvNiU9GhkiN/4LRoCDVZYOFrFyfbPJ2O0i1RjSo8bDPWXBiksrJAK0yocjHaRGoI7802YbCNn25JsUyeqntMghcXuw4c5OxTCD3znevmBwaGQ0S5SwKwNI1I9BEOK8Hfoivj95M96ItVDMVSDBlnu40jQYnWTDM6Lct63Z2+djscAU05dwszCfqkehiFF+Dt0NXXA6jkNUlh0O6o1C8aNSfUwDDbOrnKg8VkYDPWWBmmGMqQXTvhkr9mjuWl0igdjSCmmL3f9xQgLQ50Q6JyJ+P1I1kQmLJ9oFo1GiAiMf2Oc+d3XU5IWFiKSKSKPiMgKEflWRHrZxyeLyFk1P0RDQ8HfoSuBzpmA9mEYGhf+TpmpHoKhGiQlLESkJ7AG+DHwPXA80MQ+fTzw6xodnaFBIqJLV09YPjHVQzEYDAmSrGbxZ2AdcCJwFeBunPgJMLiGxmVowPg7ZZZrGBOWmz7NjY3bS2amegiGKpCssBgKPKiUOghE1g7dAZguJ4aEcfwYJv6+8RDonMm2JdmmIVI9JFlhEYpxriNwuBpjMTRCnEipCcsnmkgpgyGNSVZYLAd+EuXceGBZ9YZjaIyYSCmDIf1JVljcB1whIm+jndwKuFBEngWuBGIaI0XkfBFRHq99Ede1E5GnRGSXiBwSkXdFpHeSYzXUI9yRUtecbarSNnR6djTJmfWNpISFUupDYCzawf1/aAf3g8AwYKxS6vMEb/Ur4GzX60LnhIgIsBgYCdwGjAMygPdFpFsy4zXUP0Qo1zAMDRPnd2z8FvWLpMt9KKXeAN4QkZOBzsBupdRXSd5mnVLqsyjnRqMd6cOVUu8DiMin6FqAv0MLGkMDxYnFD+7M56bR8OziFA/IUOP4O2WaOlH1kCpncCul
vlZKfVIFQRGP0cD3jqCwv2s/8BpgCj41EoyGYTCkF3E1CxG5MZkbKqX+lcBlz4lIR2Af8F/gDqXUFvtcT8CrhngecKOItLJDdw0NGLeGMWH5RNSMOUbLiMPtJTPZlupBGBosiZihnon42cmvEI9jALGExX50Yt+HQBFwBvAH4FMROUMptRNoD2zy+Owe+70dUElYiMhEYCLAMa1axhiCoT4R6JyJtbsAsibCoDmpHk7a0qJTB7a9kV0eJFAf6DV7ND3NJqDekIgZ6kTXaxiwDfg7cD7Qw36fA2xF+xqiopRapZT6jVLqNaXUh0qp2WhHdhcqfBFC5YQ/53ise89RSg1USg1s17xZAo9lqC+YXIz41LdaW4HOmUjM/9GGdCOusFBKbXZewG+BF5VSk5RSHymlvrLfbwH+jXZAJ4VSKhvYAJxpH9qD1i4iaWe/m85FjRBncTF+jMoMPKcr+bOeQPz+VA/F0IBJNhpqBPDXKOfeAW6p4jjc2kQecLHHNacDW4y/ovFiIqW8cXIWHA2sPnFMYTbrVm1m5ccvcWB/Pq2PymTA0Gvo3ncM4jMdFNKJZH8bpcDAKOfOBI4kOwARGQicCjg5GouBTBE5z3VNG+AK+5yhkWM0jApuGq3noj5qFSHxcc21k/lg0bPsyJ9C8cG32ZE/hXdensvieVNQoVjVhQx1TbKaxXxguohYwEvo4oFd0KU+7gaejvVhEXkOnS+RjY6EOgO4E8gHHrMvWwx8CswTkd+izU53orWPPyU5XkMDxERKhSNSP7WKV3YU8cn2XZTyGRU1SE+hrGwUmzcOZf3qxfToNzaFIzS4SVaz+F+0kHgA+AYdlfQNcD9akPxvnM/novMo/okOmZ0MvAycpZTaBaCUCgGXo81aTwCvABZwgVJqa5LjNTRg3H6MdEKFQqxb9QrzHruev804j3mPXc+6Va/Uyk752PefrPF7JkpRaSkTX3uTotLSKn3+yRW5KIrxE1l1uBllZXeycun86g/SUGMkW+7jsFLqx2j/wQT0jn8CcLpS6kalVEmczz+glOqjlDpKKZWhlDpWKTVRKbU94ro9SqmfKqXaK6VaKKVGKKVyknoyQ6PA0TLSJVJKhUIsmjeZd16ZV+umlZtGw7Yl2SnrQDc3J5fl+duZl6PTopIRHkWlpew7vJ9FKDJ4BG1ocNObA/tNlnc6USUPklJqg1JqrlLqT/b7hpoemMGQKOkUKbU+ZxFbvs6n7MhSdFmzU4BxlJV9zOaNW1m/uubsZd12f1mn4aduYVBUWsrzq/N4FXhudR75RQe4dsEiPncJj1jMzcllDMJI4ApC+Hkk4oo1tD6q/uSMNAaS8lmIyHHxrnFlYhsMdUa6REqt/Pglyo7cAUTm+jShrGwIby98kA9en1XtqJ+7hnxZ5+Gybk1CAZcrxUhglFLc+e4H5B84yA/RwuOGvr1o07Sp530cQfO5HQB5P4d5jUewuB1oC5SQkfEAA4YlVTzCUMsk6+DehHfCnJv6F5ZhaDCIAFkTmQA8k4KMb2066RNxNATcAGwgWPYYwbI+FB9czTsvP8CGNR8w+oZZVRIYdenYdmsSP1qdR0gpvrAsACZbFkN3FrII3bfgolCIeTm5TBo0wPNec3NyuVwpTrZ/Phm4gmJe5gQsTsTv38dxJ/ele580sCsayklWWPyUysKiAzAK+AG634XBkDIiI6VyJy9mxSd11zuh9VGZFB9cjTY/ObwAfA18TIXGUfWon6zWT7JtVnadmqCcBX4kcIll8RWUL/avofsWjEQvBG1DIeZG0S7KtQpb0Djcj+I1yrCYAjxCVR9OhUKsz1lk8jZqgaSEhVLqmSinHhGRuWiBYTCknEBnXQa71+zRrKhDDWPA0Gt45+UHKCsbRYVgeBr4PZVNU07Uz6ykQ0TF769zrcJZ4O9VisFUuKQfA5x+A1noBjXRtIu5ObmcHQrhR8fQO/iBIVh8wNdY1mdsqYIQdYILtM/oDiBcg7vi+j/z
1ZrXjCCpIkn3s4jBPHRIbFYN3tNgqDKOwKjLXIzufcfw1Zr32bJxKGVldwK9gY1UNk05JBf1k9X6SbYtya5zX0Wk2egy4F60mWEUhJ2LpV1s2rOX1U0yOC8UoqjMwlItgYrzGazBqqIQDQ8uCNfgNm0cynN/G8/eQvEUJNFMgUZTqaAmhUVnKm+dDIaUEuiciVWYD1kTuWvKJGYu61er3yc+H2NumM361YtZuXQWB/bnc6TUIlgWaZpyiB/143SU6zV7NNugTivLRjMbTQP6or0xayI+kwUMBgZYViXt4qGRuilmcGc+Q97/moN738R7XryFaKzFO3pwQTOCZXey8/tfgvqOWKZAFQqx7stXWPbO3ziwfy8qVIZu1Nkf6Erxwc38598PsvLjuVx/y3x8gZpcQtObZKOhzvU43ATohc65WFoTgzIYahJ/Jy0w8mc9wU11oGGIz0ePfmPLd8XrVr3iYZqCWFE/LTp10P0plmTDcvu+Qp3nVMQyGw0T4TulPM+dAazwCWvWrq+kXQR35tNtRH+af36Qor3Rhejh4oPMe+z6cmEAxDQzeQcXOPQGFSCWKbB7n9G8OvfXfLf+C5Tqis497gOsRovHTsC7QC478u/m+b9dz49ufbHRaBjJisUPqOzgdjxRH1L1QoIGQ63iCIxUREp5m6bWkJHxAMefcmxY1I9jZgLdC6AufRNelJuNPM4dOlIGIpzXJMPzs0c3bco3+4s8fRfblmQz+VfXMC3LW4jCDFToTnbkH18uDE7pdV5UM9PmjUNp3qolemH3Fj5wbJSn1FrM+pxFbN6wBqW6AR+FfYc2rp0NrASuB0ax8/uBjaokiSgVLxLWdbHI+VQWFiXAZqVU3YWcxKFX545qwTjTgdXgjdP/OVGBEc9u7V7gHSLvrUIh2zQ1v+Iew8Zzf9ZofD5fWMmS+tTAKBpFpaVcOvffPBcM8qNAgDd//MMw7cIqzMcKKSZ/9hUf7+oYJkThQXRt0bnovOESMjKG0qxlMQf2XWRfswU4DvgZcB3wCm3a3cnhg20oK3NHnQGUIDIQpS4EZqONZy+gAw+2AK1p0+4QzZp3Yuf3xWhD2jiPp1oIPA68V/5zl8xZ3HDb8zUwY+nBn+84fqVSyrNYbFLCor5ghIUhHokKDK8IG1hdrhWMvmEWU4+aQ/4HOeUagHPvaE71ged0ZeSa6eUCxokSTVXZjtrg8eUrKcrJ5VnL4ka/n7Z9e3nmXRzZsY3Pe/di6tOfs3P7WlToeHTbnGsJLzDxb+DnaCFyJxXmIUewTKV5y5Ecc3w/tmzcVkmDa9/Zz+4dQYLBpcD/oEOZf19+H7//PhTbCFmtgbfx1k42oF37X5f/3KLVJdyS9WH1JiuNiCUskvVZWMDZSqnlHucGAMuVUiYpz5D2JBopFSvCxnGMMqzyvd0mr/mjFgIw/g17t7oc8iX1JqbaItIpPs2yOCtK3oXfJ5ydm8e7Z7XirLePovjgfLwX6kVoM9LHaDepoxnsQe/0d9P6qMxKwQVag7uR03pdzuLnb2fTVz2xrHbAJ7h/n5Y1Ct0toZTYpqzjwn6uryVJomnLsUjWZxErU8ZP/OxugyFtcC/q0RzfsSJsysruZNOaWWw70qpSKKs7ObBcSNAwTEzx8Aq1HaVUJd9FUWkpv/lsNQ9fPJwW+3dxavsmfFkpodHhPbQJqAk6Gz5cM4BpiLQGCAsucDPmhtk89dDlFO29C6/fp1L3oPOOH0T7KCr7UbTWo38O1NOSJLHyUcB3UrTPJeTGFxGfiDj/G3z2z+5XS+BSYFe1n8RgqEP8nSqKEHpVrY0XYbPj63Uxy24EOmeGvRyqW947XXG0iqmRobaWxXOr88Ke111rSvx+brngNDIyHkAvzG6K7VcfKrLhP8JdqBFWsntnMGahRvH5CJYdItbv0+cLIJKPdmYvRJueFgIDgKPQIbQLCWQM5YSI4IT6Qqxil9CkTbTPxRUWInI3UIbugqeAZfbP7lcR
OrbspWo+h8FQ57gFRmTVWm1mWB3lk2vo0rxFuRaRjACILO/dUIgMtXVefmCwndUNVKpaW3SkjKsH9uS4U7qRkTEU90Lt9/dEaxSriZUNH0ygB0a832eno7szcvzvaNOuGPHdjPgG06bdNPoO7k/nY8po0epSumTO4uJxN1a5plcqUaEQH789J6q2DMdEfaBEzFAf2O+CFghPo6P63JQCa4HXExmwwZBuuM1GWa2fZMaBm4Fo5TsASmjRbAY3ntKu/IhbAEQrogeVi/LFqtBa34gVagvg27MXCK81NUopntvwLTMuHujpczhc3JKivSG0GaiI6mTDx/p9OjkvPfqN5fQzrkrmsdMaxz+xYul8du/4CssqI/ocNo96n7jCQin1ITqHAhFRwFNKKdOVxNAgEb+fbUuymcBEnhk0J2aOxOB2e7ns+NOB5ARA5EIZT7jUJ5wM7Vh4OsA3bOLkj1fx3Ec3lAuJ80dNpnvfMTx5/wX2JwVdkWo42nTihM06m+H4Dudkcl7SnURKkXhH840nuhP/cNTvM6GzBoMH7vDXZ16tnCMx+dfjuezTN2jSpRuQeKiok3/weTDIyWjr+1keeQgNGfdcOVzn8/EyXTgSegxdEOIJYD5+fxlIBlYQ4GjgLiqHzc4FjpCRMZSLxt0YN0kuWs5L9z6j641ZKZGQbvH5dPWAV+ZFRPM9h843cR8D7Stqp5Q67DkJcYWFiPwfcJ9S6jv73zGfQSn1s7hPWssYYWGoCazCfJTSoa/FhbvDS3BQUX4jGQHgtVDGEi4Njci5cvga6E1LStgC/JLwaKdfAvnACiovbkOAYfj9H3DiaSfWSz9CVfAWAqCjtIZysS005/7lOnZ+fzvhSYZOf5V16AREJxlyBrDaUsrytDgl4rO4AHjU/vdwYofHNjw1xdBo8XfKrBT66lWCI5lQUc+ifDHyEBoasUuUB/mAn2GRT3i5jTLgHrwdsn8AJnLxVVPpccbYRiEoAFYsnR+zaOKb829m5dL57N65FjgdrU04GevHAT9BR3bdjI7yOg4tQH4XdQ1PxGdxouvfJyT1RAZDPSdeXkQyAiDWQjk4Tne5hkIlB3goxL4yC6WOAvxk8C4WJ6NNUc6itoGYBQIJkf3JC3TvW3/MSNVl3+7viDUnKtSKHflTgKnAJUBXwvNSHgRaouf5fftzC0FHvXpSlaqz2Uqpgx7nWgIDlFIfJXNPg6G+UVRaym/efo+HLx6elABINFKoIRPpALcK87khdydffnUn8Ap6IctCL2pfAr9Gm01iZVX3ZUf+Lp7/23h+dOuCtBYYifbHiHddMGgRe05ORJueioA/410YcSCU/zWWIHI3SoV2RBt7shnc76OzVSqV+wC62+dNuQ9DvcctECJNQ+4Q2WQEQCKRQo2RW8afz6T7f4dldSLcL7Ec6IYuZh0tq/pBYArQlB35t6Z1Fdh4nfwcf0si14VCB4H7iT0noJ3/0Ux496AFyULgbjof04od+eyJNv6aLPfRFLBinDcY6g3RciYiQ2QbUxQTxBaiVWXchedy20MLsCwn2c6pCvs74C/AlcA76CJcd1DhkL0b3YLpWrRDPIOVS+dXS1gkGo5ale55MTv5bRjCksXTKNj6LXt3fcuRIwrU/faz+6hUj4zmQGuPOXEixK6177+F2Ca8tcCtdMnsxvW3PM+srKjVPuILCxE5gfDe2gNFpFXEZc3RRVW2xLufwZDuxMqZSNcciaos4k54cDJNlRJNPEx0vH/s34NjfD4CAT9H6ENFpM7X6EWyj/0+D3gRXR9qC7ox51Z0Q6JT0QtnM4r2ReYLJ05CO/pgkP+bdRFFe4vQe+MWFB9syX8XPBuzPSvEqjPWhGCwNas/X4VS0wn3K7xLRan2ikZNbdq2p2jvPmCyPSc5aB/Pb9F5FE6hxb32z78hPCcFYA2BjBZcPO6OhMKGEzHu3WSP+B10tNNj9s/vuo6/BlwB/DGB+xkMaY2XQIDKdY+86h15YRXm
V/mVzJirUj5EzZiDUiT0XZVKdFSjrpUz3uc2biIUChHIaIFeIN21n04hvDSHcr1/h65C+0vgTbSfowUhy0KFQlUa07ovX2HTV6spO9IS7Qz+BVBCWdlHbN64lbVfvszf7j+Por0tgTnAF+jlcCuWtYVv1q1iXc6rle6rQiHWrXqFndvX4r3LfwE4iFJfoDWJ5WgB4FTUvQ0tRMHJUh9y0S2I7LSvuxVdth1gLHAjOoD1VuAze25mAz923UcXQrx43B306JdYFFkiZqhn0CU/xB75rWjdxU0psEEpFdXeZTDUB2KV1040RDYSpfSinCzHvv8k+e9lx931V6d8yLOL4a4pk8if9QTW7oKYJdNrSqsKG+/GzWy442EOFR1Gx/kfRUXtp5+hd9eXEt6DYhM6FDS8zDiMIlg2OK7fwsuM1H/I1SxZ9DCW1QW9OLt39/+hrOz3fPj6bZSWdCTcr1LRRU+p3by36M/06KsX31AwyJLXprFm+VsoZaEt9X8FZhG+T38avdhHq6h7t318HrCGVm2OAYSMpgGOlGxEL8n77bnrAXTwnBvtbp4FnFCljPVEQmc3A5sBROQCYKVXNJTB0BCIJhCezs5hQd76pHIkrN0FKPv6qvT9vmn0zfjemxj3uqos4o4JCmDmsn7cNGMOZE3EKsz3FE7J9KhIZrwjQyFeXv41lvUtWiC8RcXu+zrgDXR4Z1sqFsDh6Ezuyk5by5rKyqWzogqLaKamtxfeh2VZ6H4ZLeyr3Yvscg4XlxLdWZwFPEZpSSHrVy/mtF6X8/cHL6T4YHPg74T38r4eeJ4KgbGZyhV1ne84CR3RdAdaCMCuAuHthc9iWbNc952KNjkFos6NHuP/0LxlWy64YkrSGetJObjtOlEGQ4MkVs5E/zVrOVuppHIklGXRbUT/8qKEdTHmRBZxa3dBpXE9u5gKgeGhYVRVq4o33nuU4mVrHzq8fx46UcwJCfW5jrkXwNhO2wP786M6oZVSnk5my3KEwqvAZQS4miAL0EIqC6317ERrNcOp3Na1Nzpl84+sXDqLbd8ttwXFSirv8Aego5VuRTuli/GuqOv23zxhP/NfCYWWAJ963Hcg2o8Ty6ENbdoeU6UggGSjoRCRS9Bpf6dRWXwppVR0d7rBkMbEy5lY4fdzXob3f5nIHAnHB1CbggJqbhGHcJOUm5rMPPca7xUIC7kfH9kEmUR4mKwPOEj4AngcsXIMWh91TBTt4X6QzQTLjqUi6c9Z7J2d9+MEWIuP9/DzCBb3ohfZIvRfwnNUrk/1H2A02oz0Z3bkr2VH/lq0j6BJxPiaAfeitajngRDHndyDLV//El3E78/o8Nfr8NY01hA7FPZ/Ys4NwP69VQsCSDYp7zK0M/tddF7FW2idbQhal1papVEYDGmAV3axm3OOOTpuroTbvJM5ZZLu/lJF8nZ1pZfS9/TKJK/yIh6KHuE+a8MIxvMEwZ355WVNairzPNp476eE15hNiCA+zsDHVoKcTUXdotaEL4COL8O7zHjXY09nbfa6CO3hJILB+eiF2En6cy/2c+3v+o4Ay3kFxZU8gsXt6EW2Kdqh7uULGIZ2hncD/td17/vRAsOJZnLobb9n4vNtZfvmErQvI3JM31O5d0e8UFgfcJ/n3DhZ28Gyg6hQKOnkxWQ1i6lo9/sUdMGWLKVUtoicCvwXHZZgMKQl7oXciwf690A7CCvInDIJoHy3Hc0J7BQdBHhmkO3M9hAUycTor/ikACYvptdsbydkVRZxa3cBSsFbvafDJwWV7llcuJtnBs3hriFflj9zTWWeRxvvIcBPGa8AV/IoIYL4OQuLn6NLUrRGm4GcBdDxZYTnGDhO2+1bvvEIUX3B/lYvs9AwdFhuUwKEGE2IkcAVhHiZP2HxDlpDmIr3jv4OYBLejm/n3te7PrMGvfT+mlDoz4RCH0f5nJdgiK1V6eu/gTBh686/+B3BstuqlLyYrLDojvbQhNDxawEApdQGEZmOns3YraoM
hjhYuysvYp6ELJKpsN9tRH8eaXZXwtcXF+6uWPAHzdFtV7MqO5wdIZQ7ebFe4KOQaAZvolR1Ee82oj/PxBgnaKf3BPSzPXhmL/wdqp99Hm28e0pKuRwYDAQo4yXgSl7H4gz0jng52rziFg5XIvIbmjSdgj/gt8uM30j3PqPt/heRi2z0Dnv6no8hFBBgKzPtEN37Ocxr/BGLq4BVHvd06I0WJtHu/TgVwqIELfg6Ezu7+g50VJYjGEJoE9hX6GU2Vub2VLRG4+SkHGcfr17yYrLCIgQElVJKRArtUTilP75Hu+4NhnKSyRUAyhf/biP6J3T9W72nx1ygwzgAHNgd+/tj7Pxf+rQT46kwCznaRLmzOM44YmXwOpm5yfwHru3yIW4NI15YrRdO4t3d5w3lng8/9kwYLCot5YJ/Ps8MFI8CV4G9q/fzMiOx+BgYhA4f3YJeaDfTtFkrho/+X3r00xqZ0z/92cW6dWrxwcjddzzzTQ4ntQpxVrFwckj/EWp/SoiX6YbFbmLv6I+Nce+N6GKIzg5/B/AnYHqcMR2yn/tStOntQ6ALOkR2CLrirqM5TLf/fS3wFHA8OtPBe6zxOgp6kayw+Ao4wf73CmCyiCwDgmhj3aakR2Cod8Qz50SSbI7Bs4vRC3siJCooEiCRnX+ubRZyaxPxdukO0TN4KzJza7OukTuUN1FmLutH1oj+5T08ksFJvJv63odkF+z0NIvNzcnlChQd0eltn9nH76eM15iNxXvofN/baNLUR7uOP2DAsOl07zOaCWN9HPv+k3ps9pZ1AtDxV9cwLctpndoEbYI6TKzF/ujmAXaX7GN6hJ/qfuB1ZnOYXxC9FtPdQDTBnYN2jg9BL5MBewzjgYdjjglaoQXL0WgB0Q3t7G5CRTb7ZrRQsdCak49YPh19fAitj1oRZbzRSVZYPEeFUfdutKPbca1bhBvmDGmCtbsgqlOzKo0Skw4HrUKOQSpIZOcvvrGsGOQSfkkIK72bq3r/6GoTssicMokZy/ol9bEZB27mrilaw4hVGsRdcgTg+dV5zANuLNjpmTBY7vBG5xtfARFRUhYv8xAWwvARZzOrTTE+ETjyH+Z3+RmSNY58vz/M+W8V5jOxRQFPndKNzRuGEAy2RkdTXU20xb5Fk3vp3c5HRkG0Phvwhf9f7Lcy0OGp9xBen0rQ+RklEfcuRmeYH2t/xp1rcYz9ObcvpmJM+r7XoMNrx9vH3Wa066lYbhfa93kKHWrr7dPRguIkAoGlDBh2E8mSbJ7F465/rxSR3mgdqTnwrlIqMrM7LiLyFrrg+kylVJbreDvgIXT+enN0YPEUpdSaZL+jMeOYShxHrRezNozQ9vlESXTXX8+o7Z2/t3nEwbt/9IpPChhp7+zj9daoTWYu6+cZVuvGXXJEAZcrRR66gIVXwqDj8D6ELhf4ecT97qeE11nIpccfw91tivH7pFJDKrdpzPlbf7vvPYy54XuWLJ5m11v6HL2gvoLOcXDCYbVT/IIeAoWE+VMOlwVRlkVr9J97ZnPhjFZtWbqrGCs4A90LvC16Z98N3WAocnH+NSKd7TIeXtnU+9EdAN0CKAftXwgAr6OFSyGQQWyT1T60YWchjk8Hbgd+jg5YPQ4YQiCwlBNOPa5KvcaTzrNwo5TaBvyjqp8XkevQZSMjjwt6P3oi2suzF50P/76I9LO/15AgmVMmMTPmbjIJQdGAqe2d/4Ch1/DOy455pHLI54BhN3p+7pFmdzE+rC1m8rijtaqKY5LyKkESWXIkpBRLLIuLqTAtRYb0Og7v4WVBBliW565+KIquzZqU9zoH74ZU+0pK+MnbHxPofCIbLsokFMrACh5G92u7EB28+Wd00toMYBMifi68cir/GlbA9o9Wlwue8tavQEdgDLD6cCl/P+8klu3cxaM7j2Lfrr2ErELKjpxEKNQW7Zc4t/w7RMpo6jtCifUosTK+tSAYQoVJaS9aE5lGhSbyS3TBxC/R9jZ3x7ufoc1S
be0Z+5X980Hgelq3fZMWLTvb/rcVDBh2U5V7jSdSddaJfEoEpZRKSACJSFt0oZIp6OwUN6OBocBwpdT79vWfov+WfoeeEYOhRqnKzh8SD4ft3ncMX615ny0bh1JWdieRIZ9V2e0lQ7cR/eOaoELBIO+9djd52UsIlhUTyGhBz/4jGH7FPfgCAbZe4F2CxF3C4xLL4isqqotGJgxeu2ARL149ptxB/9u33iWnYAdnWyFKghYhpRBfgBbNAmQEgzQttWJW1Q0pxY9feYstBw5hHSjC4nj03tJdW6kXuhmQD73jLkGpATQ9qg3fv/8m4q9ow+NOHLwHrfGcHgoxf/seJp5wDBMfmcQNf/2Od16ZRyi0lAofwlPADkTK+OGJTXh9SylwAgEudGWDO1RkfGtB8R7ayv+A/Y1uTaTIHvMv0Y7ryI53W9FegNZoY0xT4DEyMpYxbOQva8wPlsjCfi+101v7T0CeUuoFEfESFt87ggJAKbVfRF5DC3ojLBLAiak3JEZVdv7JhMOKz8eYG2azfvViVi6dVSFY7JDPaLs9dw/wquLvlMm2JdncNeXLqFpmKBjk7w9eZJepeAzoQ7BsNTmf383GtRfxi9+/g9eSEZlsd69SDCLcYe0wzbLoc+AgT6/MYco5g4DKUV3BnfnlgQSOj+Tx5SujlkZfsHYD+fv3sgi4km1YfI5uIwrR8x10JvWGlybD8J5hWoXzLPvsZ3gVuEEp5ubkct2xncmf9QTr8kojTJYVPgSlFrJ6z2SCKkCA+yOywR0+I8AeghyPNj2djHbC/4XKmshNaGHRmegd7zYBF6Gt9kPw+/dz/Cl9a3QDkkghwek19m02IjIUXUe3kgnKpifgVWs5D7hRRFqZYoYJYDs0Y5ugDA5V2fknGw4rPh89+o1NeLc38JyusDx+L/BEkFity4Alr91tC4rKyWXFBwbw3mvT+clVMyp9zquExwnoIE8v09L5wPNr8vjZgL5Ry4SMXDOdfFtQxKqqW1RaykPLPuMKAgymjA4otvMIIf7kuptXvgNAb3YcLg6bW3fi4D3opXgkcBnwhWXxwtad/OLEruzeGbsG047DxfjlCPBqRDZ4W6CEAHfgYy9+/ojFcej0tKFR7ulDL9XTiV7mYyLaT3EmTZsVMXzM/5ZXv60p6rxZrYhkoMswPqyU+irKZe3RxrtInBLo7TzuO1FEVojIir2HS2pmsA2AbR36pXoI9QZn53/RuBvpkjmLFq0uoUvmLC4ad2PUhLn4TvH6k6Oau+JtoieJ3cuaFf8FwiPoInt8OHRDxwf1tV/9XO/rgGNcfUK82Lakwi8Srb8IwFMrcwiGLGZSxqPALhQZzEY7fN30pnJvtjV0ad4i7IjjRxnWtAl/R6e3gfYgbAE22IEgHTofS3ivjfD7dmrVCqzDjEbZeSNl+LkL7YA+mwA7WQRk8DI6+/skdDHF8WhxOxxtlnJCeYPEdnAH6ZL5KJdd+wtunbaU08+4qsZ7kVfLwV1Ffo+ObpoZ4xrB2/QVdW+klJqD7khCr84djfHFUCWS3fmnPBy2BglZJcR6lpBVwrOLKc+7CHTOjFrC41HgVp+PpUqxytY6vkZnaa8CdkHU+lXunX6sqroAL67J4xooz9N4FbiGMsq4P0K7WIN2CDuUAFO58ZTwfadjEnt8+UqKcnI52f7ejugC4ce3awvAxNa7md7sQUpLKpssm/mmsudgAZayyhe5+znCazyBxUcE6MJo/IzE4gqEl9mMxQ3ocFqvulX/QOtk0f1pbdp144bbIq35NUudahYichy6ZONUoKmItLUd3bh+9qM1iPYet3B+s4kVozEYahnt9I6+w4zmFE+Eged0jVoXqnZwFiQv1tjn7bpSNuUlPJo1rfT6QoRjIsxTl6Ot8u7quLGIVVX3qZU5KKWYRkWexki0MUZrF5sJcCFQgHZy90FnUuvdPeRzX/YWxs5/l6v+/TL7SrRFIr/oAP9ctTpMW3oUXaD8+TV5HGrT
kVHHHc0552SSkTHUvp++b3P/mRzdPJ+i0iNcQ2TeSAZ+dhBgCTM5AsD9WGTwMLAeXbdqHFogjEPXZV2Lds53RC+bkVaTEkSmM+Si+H1Pqktdm6F+gBbD89ALvvMC3SR2L1qnykP7LSI5Hdhi/BWJYZzbtc+AodeQkfEAXv+JtVN8vNfHEqJnxwJEasZf4ZA/6wntB/GgeYsm6EW18rPA3TRvWdm/8NDIC3l7wo8qvRZcdzUhEd5AG4QuRLtff4XWAPZRuS1tUWkpE197M+xnLxPXNMvi/1at5vnVuZwH5XkaTtWvaUAGZQQYYzuX+6P3njlo74MTsjqBQ9YXfLNnIN/s3cePX/kvIaXIeu/DsN4lOfb9XwVQiqdX5uAT4aFHHubo45qh/QUDEX5Oz86F7DxURMAeh5v7KaMphVxKpBApw89JRA+xbYUuqCjoPJEK4QT9+UGPk+nRb0yl301NU9fC4kvgAo8XaAFyAVpbXQxkikh5zTERaYPePNSTfODUYhXmkzllUpU6tBkSp3vfMRx3SrdKO8yMjKF1Eg7rELnQeuHvlImIdh57cd6o36JLvA0kfEEaCHxvn0+MSEfxx+h98jx0pad7Ca+O63zG3Uc80sTlvPzAYHsxzwNGABcTvgBfBgTIYRGKDAqAn6JdpX9EC4oT0JH7nQjwXxYB+fv3Mm/NWtbuKGQt2sdybtOmDPf7uZgKR/fSzVsIhkKMvPAStnyzH239XoniH6zaXoYCzgPPcZ9DuDEM4H5CZPAfKvtZoCLEtjtabN2FdtRfhs7p2MSYJAtQVpU69Vkopfah+3mHoXPw2KyU+sD+eTE6Y3ueiPyWiqQ8gTBDpCEGxrld+1Q1HDYRjn3/SRL1eLgX2pi9JXz+qKdOP2McX635kE1ffYZSt+KUrxAJceJpwzi931UJj73cUawUu0uP0Bxdoekq9KLzmc/HgiYZekh79npGPHlVqVUK9peW0hq9GHyA3mvfF/H9M9HW/sHAFQR4melYBNBL9UXohddHgIfLS5JfToC/Lf+SsT4fz1oWN/r9ND/tFBasXV9+/5nAWYeKmb4yj1272gPu7OxOZFBEN7QQG24fPQCIX/8dhKwQLfAqKVLGB5XCa0Gb/9pT0RMjvMxHIOO2OhEUkBoHd1yUUiERuRxdaesJ9G/jU+ACpdTWlA7O0OiIl3SXrFM8EW4aDduyEivxESu0NBnE5+PKG/9iC775LsE3PmnB53YUL1u1mr52NNNYEVr36cm6XbvDEuweX76yUsSTV1Xdx5evZM+q1TynFDcCP0ILBK8Q3XPR5iNdlHAXFtvQoasb0D6LOQSYzUwOA9oJ/XoQfm3fQ7fTzeNKny/cZxIKMX/rXsJzIvYRoBdCkDeo0HJAm0rOQhjS/Xg+37SDDw6X0CcUQi/+Fe8ZfEm4wa0ErZftQvsxjqeis98R4G56DqjdysNu0kJYKKUqRTkppfagdcef1v2IDAZNTfegSIZ4eREOXqGlybZULf/OGhR8RaWlzMvJxadUeZObe5Wi/5o8SkN6nDf07cXkt95lfeEulrsinvqtWs2Y7qeS2aZ12P2eX53H57Yzbho6FNdvv7vJQOczH4W71Lizc9eRUQHuYjSHwgTBWHRFpv5ol7JSqrLPJBRiAYfRpqx9BLiaEN0JkM8QvAXXYODDLcUUFbcG1Q5d0uNMdC3WPejGRPlU1HZyyo7vQUdD9aOi+94/gEJati5h+OXTY/8SapC0EBaGmsc4t2uGWEl3mzYMYcniaRRs/TZu17tkcVqqWoX5Uau8QuzQ0qpoF4mwdmMZvYje7tVhbk4uJ4ZC9CXcn3BpSNEUeG51HiXBINnbdzBOJNznoBRT3/uQ/xt7edj9IiOjrvL7adu3V7lwdPpjrESF7e51I6NHsMp7fE8iwORK8fv3onWOX6EjoNy+BwenEu37PIhwGj7ew8cS/FDu6ziKCofwQSAYUhw5cAj4J+GhsU7xwFXo5fjn9qfboKvWbkQX
AoSKjO0BHHdye8ZNeBZfoO6W8DpPyjPUPsGdxrldU0RPumtCMNia1Z+vYkf+FIoPvs2O/Cm88/JcFs+bgoroi5AsKz4pSKgPSKzQ0tqiuHA380ctBPTfmld/E0er2KxUeZSSw0x0/+XzLIsX1qylOXBPxO7mfuDLgp3kFx0ov1+0yCh3RNXcnFyGoDydy0M4YkdGnUqATQzF8rzOccBno/f3A4FBfj+D/H762P9e5fcT4C0CzGKR/X0nonWDS4H9IqgmGagmGViBphxRLShhJ5VDY79BF91ugXZkt0Fb3zug3bPhSYNOgmTpYVWnggKMsGiw/FeNTPUQGgTRk+5eAA7a5acrFoCyso/ZvHEr61dXX1Ln7eqKUtGbTSW6gEaybUk2Wa2frNbYnF7dTr/xyFa4c3Ny6WpZnI13VNBZQIFS/EAphka5Zhgw9b0Py+8XNTLKFVH1XUEBy8VHH1rTm/b0wE9voA8+PgcyOAaYS4C1fEZr+tCOPvjoQ6A8w/xj4Fmfj1XNmqKaNaVls6YMOOZoQiK8BIREWHzDeJoGQozmMCPRfwGb0fFMM4EMv58XLh7KyUMvY7/VjhKeIXr71OX4/UH8fidb4EH7bumV7GnMUAZDDMIr0YbQQuJp9J7zSWqz692KTwrALqjnRbTsafcCGum78HfomnSnw3ioGXMga2JY69VNe/ay0+fj+1Co3J+g0PVZW9v/3od2Qq9Bm3v22de1tZ/BAvbvKKSotDThfuMPjx7Faxu+4e6P9lAWvAwff6KE47H4M7jKvJfwiv2vhehQ1Cdp3/Rslv1krOf9H1++ki7fby/3Cz27ag1+jpT3674bWIQOzH0AuDAY5J6VuXy273hQLYhdqmMrHbuexuHi/RTtnYo2gO1Dx1OdQoVT29nbVy/Zs6oYYWEwxKCiEu2lwP+gY1t+j84hTe3OL9EF1I2jASTV6TAOTgmQ/A9yyo9FRjI5PSI+CwbLS39vQlvwHa4X4StgpcskdaPPFzUyKhqjTvkBr2/M54utf+JlFFeyHYvpRG8zOgVYQ9fmkSafirHPy8mlj12J1omSGg2VnOOPo/86ZgJ9txdymIeBZ4hVqkM4wpRjDvDwxn3oelBBdO5GZNmPucCRmL1PahMjLAyGGDiVaDd91RPLagd8gl5wHifWAlBTO79YzuRkFlA33Ub0j9ntMNH+HGGf6TMYtSQ7TLtw49aCnIzoyM549yrFYCp60EHVHPY+EXp3bMGx+cLIkGI0R3hNvqbMdyZBazrhbUZPBcbS3H8WN53WodK9ikpLuXbBIo61dOHzv6Cd30oppkdcOw2d5f0H4LdoX8en/IEyphG9J/bdnN42xGXHHs0T24pg3zb031gJAa62+2CMQrvUbycj4+M6TfZ0Y3wWDQyrsP4UrqsPOEl3Ldu0RGfPOv/Zf4ZeAGq+zIcbtzO5LnBChd95ZV5SjvuZy/ppIRQFdw2p4X4/A4juy7iX6D6JRCgqLeWFNWuZbo91JtDcH+QPvUO0bj0FHcg6A7gWR1Cc0+Ugl/erPP6nVn5J/oGDfIcWBI+hf+vnRhn/MCrinVYAQj5+tqI3FcMIz4wfwOltdzFnzEVMWrqCsn2HcP7GAjxc3gdD/839AZ//hZgVkGsbo1k0MJSC3MmLtb3bUCOIz0ew7BDhZqfrgDeI7Ltcm13v4oWq1gTJ9udIFLcW5HTGc5vPDh0pI2gv7u7sbgcvk1o0okWI7S4pYemnH/GHGU7S4V85IRDkptM6cG6Pc7j59bfCEgWLSkt5MXcdi9CFwwejI51eQxcMGYD2v0TSGdjGpTThTbsh0ywstgBv4bRPFTnED09swh/69WXOmrV8UbiHDH8z9N/YPgLMjuiD0ZtmzVvUaOJnshhh0QBZu7Es1UNocFRuuepDVzp6EZiB+DbT+ejTa6TMRyRO5NGE5bVfWTR+f47qO+7dgiNWu9SqEJl34jDNsjhr42auv+Nn9Bj6T3r0G1s+n4HO3p345ubk
cnlIlwIZizZBTUPnUTQRwVLKbrDTngojzSHgME14mzHoelKjKWYhv0AXLG9KRtMHGdbe4g/9TuPwUZ14fvV7vApcbZUCnxJgQ3kJkopkwr6cEAhy15DonQ5rG2OGMhgSwLu6rA+4iowMP5eOn84Ntz1Pj341250skupEMlm7C1ARi2gk8fpz7Ny+lr/NOI95j13PulWvJJRPEqvIYWTxwOoSr/jg819v4q4hX4YJCne5FCfk2DnmVGqahjZBdQTG+f38+Iw+tGx+FEeAwzTlIE9ykDcpw7K3ExZOT8GZKJqykHZNzqLfaTN5/LpjmH32aWR07haWfT/GJzTldgLMcpUgOUwGj9DMP42bTmlH/qwnuKnu3RWAERYGQ0KkQ3VZJ6ehythtdmNFQsXrz6FCxyedgBhNIOQXHeDp7BzmQcy8kGTYtGcvORkZ9EWH4w633/sCqzMy2LT/ILJadwZ3THpe5VI8TVnY2oWdx3KopMjudrcbPz8jwIWcyhHWojUR92fH+X3c0Ot45vbszNlFpfj8gUp5MveEQvjZzWgOh312NMV0a7WDy/sPRPx+JGtitfNkqoIRFg0I49yuParScrW2qOk8CTex+nNo1+5vSSYB0WvX7pD13odY6BIZNZV1/tDICxnbsztX+v1sQWsVW4Cxfj9X9uzOw2MuZ9uS7PK6W5ELtiMI5uXkVkp2zEILi0PAmZbWILSZ6Qg+9uPnIFvRbUAjq+CWJ0oe0SZif4eulQRSR/SCPDOiSehMFHsO7efgkSP4O3RF/H62Lcmucw3DCIsGhOPcLrb7BBtqFqfI3g23Pc8tWR/WiNlJhUKsW/UK8x67PiHzjhMZFZkxXVNE06B0xaRT0RFEDvH7jEfrn51fdICcAt2H+jFgcpyscy+8zFuJZLUHOmdW6u/t3slfYll0tbxLgZwBXOD384lSZNr5INoTAadi8QOImpE+OBTixe93h5m+IrvxReuD4Y4I83foighI1sQ68WM5GAd3A8NEQdUfqlLRtrhwN90iEuCSIX/WExDDnOXVn+Nw8UFU6E5gMpX3lzoBsUWnDmx7IxvxV/TLiFXkMOu9D7kSvTMfha70mmzFXK8eHslktUdzht+rFH2BIU0yOFAW5KgmGQjCwSO6FWrXli0YdKiYZ139uRW6LtQQtBFvADq1DvtcM7+f5hmB8qgur3E6tagGAH77ejfuiDBH2AV35nPTaOqkDpwRFgZDiqhOmGo8R7UX/k6ZBHfmk9X6yZh+i8gy5fMeu54d+cfjbYjQCYi3l8xkG4Ql5EULYX3ii2xyCnYy1z6ehdZb/mtZXJRgAl60Hh6RWe1KwaGyI7TMaIJI+IIbS7Cc7/Oxp307VhXs5KqePVDAP7JztFA4eIipLs3vUXTB8UHAs+js9AfQetg3aL3sRyIsuO7q8udyxnmuPT6/+FCWRWt7lru0asmC666OOQdgl7HPmsgEasCnFQcjLAyGFFHdMNVo2dKxcO/8E6Wi5EnlDOSMjAeYcswR7QeIoVU4TLMs+uauq+QAvhQdiBytplUk0Xp4RGa1P758Jf/IzuGGnt0r3TNWuRSloGhHYbkwCoZCNEf3dS6IEDCfo8v+vYbOPn8MncA33r7eq8+IuznUP7JzyJAQq+y5+Brdja+otDSu0HRrGBOWT0TNmFNrWoYRFg0E49xOP+KVzYgXphqrvpT4/UkLCtA7/21LsskaEVu7cOOUPNmycShlZXfiTkAcdu6xXNqmuFKyYLRd+yF0O9TI5qFO/kI7l6kmGon28IjXQTBWuZTHl6+kKCeXkZbFJZbFR0pxEVCKNhcN8vsptSzaortOOH6Ke9BmtcFo85MTPus1Rvf4xitFR/tad5n5RE1ygc6Zeg3ImhjTzFgdjIO7geA4tw3pgQqFeHXur3nrpdnsyN9K8cGD7MjP4z//nsG8v44jFAzGDVOtrcqiyWoX0SLBRk24lVltivH7Krf0c5f3GOT3cyY6jHUYcA7eTtzzfT6u
6tmDqRcMi5qXAYn38IjmXI9HpPP5Xjv57jbgeeAzoFSFOFuED9F+iq/QJUD+Djh1Y68iXHu6KBjk6ewKX5N7fGPRkVYO8crMe+FoGROWT6yVSClRDbClWq/OHdWCcWNSPYw6Jbgzv9ZtlobEWZu9kDfn/wm9j+6KrvnjVBGdRpfMVvQfci3vvvIcZWUfU9m8M5SLxt1YyQzlRL9UVbNwcMJvq/o306JTB8a/oUt+xypB4lSb/dyuNjsa+AA4qmlTz7axfbt24YT27fhHdg4T+/dl0qABYVneQNj9HL4GzgoEePPHP6RN06aVvjfyfCwcreJZlwnterRG8Rm6yOG1wDt+P80DfvYfKaOJCGWhEKOAv6GFw2dU7sU9wOfjnZuuq/QcX6O1kSXo9kcAv/T5OKlf76Rb5FqF+WQO71+lysJ/vuP4lUqpgV7njBnKYKgFlr3zN3RAZSdgErom0BbgOOB2duRPY9k7TxEK7UD3WBsP3ALkxa0vVRP1oQKdM6uVr+E4tJ2xRCvbEakFLAZu9Ptp6+FDcO5z6dx/h5mO3FFPChKKdoqlfcRafKNGSKFFvdOvYgbwjgiXnXYK/1qdV96r4z60VuFu+uQe46BQiKezc2ji91ca3whguN9PS1cUVDI1scq/p1Mm25ZkM4Ga9WEYYWEw1AIH9u8FMoGW6L3m79HLzZfAr4EOFO2djqNtiNyN+J6jY5fTGHiud32p2oipn7B8YtLaxcBzurJtdnaY0PIKY41ZpylK1FOk6ejplTksWLu+XHiceXRX1sbp4VGV73V/f9QIKbTIvxa9++8XDPJi7jpeBcYpRX/7umy0/ni+/dkDaP9FW/vf277bzM7iw5XGNxN4NyJqqqrUhg/DCAuDoVaw0JH3AXSjTsfMtBzoBnyEO1xWqVEEfEMZeO4PY0ZA1WTVWUe7aNGpQ8KJnDeNBskaHWZCiuZITraTn5fjuv+aPK70+cqFR9tOHZg96uKYY3x8+cqkOwg6uCOkDh0poywUQqF1xNZAF7TPJYDumzjKLjR4OToaaihQRLgzOARcgNaqvgb6HzjIMJEqjS8ZnFDpmoqSMsKiAVCb5R8MVaP1UUdTtHc3cCfh/oin0VpG7VV1TZbbS2Yyg6p3zosWxppoJz/HhNW9Q4dK5S+UUmGlOBJphFSVDoIO8Tr8Ob6P+eOv5MoXFzLD1TPjTSAP+JXfT9u+vZg0aEAl/8fJwNFKsdzn47xm3s9QFdNTNNwaxgSql4thhEU9xyn7YJzb6cWQi37Om/OnUzk0dovHMQfvcFm3M7mmCXTW9u2Bk7smlP3/7GK4ye65DbHDWON18nPKdXRr3ZrP87eT/X0BX7oCbrwiihLxO0QrgQ7wm7ffY+oFw2KOye17ieb7uPPdD7gsFKrUVtUpNHjW6jzGdD/V0xz2BnBWDZmbEsGdizHwnMR+z16Y0NkGQKwOZYbU0KPfWJo0bUPl0NjjPI45xA6Xre3GR4mSt0tHYVm7CxIOY/Vibk4un+dvZ9H6DSwCRCkOoU0zTuvVqRGfSTak1O1LSaQcuvuaaHWmJlsWuTsLy3MoysdGRRnzUUqR9d6H0culJ9n9ryYQgV6zR1e5Yq0RFgZDLSA+HyPG3I7ffx/hFVxrvx1rrP4RNcGKTwrInDKJ/YdL4hbtizXG51fncSUVWc6XARf4fXFbrya60EZWvJ2Xk+tZ/Tba9U+t/JKzQyEOoZ3VOfYYniJ6scCzqNAucncU8mVGgPOaNa30WtMkg+9q0NyUCP5OmeWaZFXyMIwZymCoJXr0G8uG3A8iMp+bIJIPnIlS04nXjnXgOV3pNTv+/2y3+cQrMqm6RGajP9/heHodWcvgkFUlR+3cnFwuDoX4D7DSPjYTeCMUYsF1V3Pf+0srtV51k4hd3+1LucSy+Arv0hte118UCvFS3npaZgQYHrQ4aFkM9/tpEQiwt7SUZugkQ4cD6J13c/v5fwycK8JJPXvU2O+gpnDqSWWNSC4XwwiL
ek5VCsoZ6gavCq6tj8qk/9DfgRKyl1Uci9eONZ4JyhEQT2fnsCBvfdQSF9EYuWY6lwwZ7NmyM1p13P2+/6GpEs5r1sTzntEWdGcHf00oxKVE+iTgb19kx/V3xCPSl3KvUgxG126KVXrDub5LKEQxMKbHqby2fiOL0MUAh5/8A17OW+eZcNcXaNK0CatEyoVcTTqrawrHh5Fs2RcjLOoxTj2oqmRqGuqGyAqubk7vf2WNfEdYjaE1axlF7B10JIHOmeS/l41aks0EqBRmGa06bnFoFMp/Jr/vLYwZcGbC43W0iheo0CocZgL98tZxy5n9k3b+RmpXkb6Uy6noox05N+7r9wHPAYuAq3PXMc7vL5/PBXnruADvhLvzfb601CSi4TRRSlRgGJ9FPSdzyqRUD8GQYtzmk0tDIbrY4ZzJOIMde7bTtnPCcv1q0alDzOq4h63p/Gvj3oQLWTqC7ahQiLPwtvufreDRz74o97sk6oMp165W5nj6UrLQDuh9EXMT6ch+FO1HGYwu1uIO3QUdnjDcfp2JLiqYKj9EdfB36Fruw5iwfCIDz4ldPsZoFgZDPSbSfDITXWpiOlWrXuquN2UV5jP+jXE8WfQ1Otx3HwGuJsgCdD4yQG8K/AGUSqxkupOotwz4noosZzcHANnwNYesUHmJj2g+GEebuPu8oS7tKi9q0tsgdOmO26jwq7hLiDhRWJ/jHbo7zn7/FXAder4vqsMw2NrAycWI5xszwsJgSFMScW57hq5SYW5JNJHNC8e23SmQRyGrCbAKH+/h5xGs8iLja8holplw975Ne/byZUaA3aVHCACR+/AAUAZghcr9LiGlPH0wRaWlXLtgEd8fOEjWex+Wa1fupLdDR8oIuhoVKeB9YH7TJvhEyn0K5VnbZUEutiw6orWQzyLGNw2tcWy37xOrB4eTA1Wdgo91hZPtHQsjLOoxDbBgsMFFz44FiFQs2pFEq4GUhV7QxqBrGFW3hMSEM05i2ofTwNrEKyiu5BEsbgea2eG+N6L6nIBakh1Xu3ho5IXlDX9+YleVhfBKr/1F6AGMVCpmFNNTK3PYfuAgrwJXFezkcfs73ElvQFiSnfPdV3n4Fpxs7XjFAM8AnkE3OBpH5R4c1u6CsMCT4M78alcJrgviBVEYn0U9xbERe0WvGBoHkbWX3Hb/M9AVTGvClj7qlB/QrVUBYzjMSOAKgvj5Oc39Z3Lx6UH+dctxzFzWL2ZyqON3yC86EJbLEOkz2AdsVop77J3QvUqxWSlPP8OLa/K4Bsr7Qbxuf5fb/OaVZOf+7mjz6S4GeD4wEOgvQh9ghQjH2t87RoSrevYoj94K7sxHWRaZUybxzKA5PDNoDplTJqEsq96X5TGaRT0mc8okWJbqURhSRbwaSOd07VLtEFSAg0eOsPtQETPQC/j9lPIfFvL7IecwtnMbtj/6JBOAbURvrOQs2lNd5iJnQVdQbkq7B93zIl4UU0nQQinFNPu6e9GawK/Q3pRplsWZObmIUG7CKglaYXkU1y5YxItXj6nUF9trPlsoKDpyhH8CNyrFf5zvVaqSmW/+qIUUL6sozDhzWT9ajFrI+DfGaS1DAF/6axqRGGFhMNRTakIQJMLcnFyuiPCLXOX3UXjwIBmnnwZoTVcpb/u8s6OfB9xYsJN/2cenWRaDVueiFHxhaxVefoIsKgSBIwSCoVAl5/OlVDivDwFYFpeKlAumF9bkscrWWLqEQuQfOMjT2TlMOXsQkFib1TzLqtQ//BLLYl5OLhNP6ErmlElhgsKhuHA3zwyaw02jIXP3l+TPeoLgzvy0KeGSCHVqhhKRS0TkPREpEJFSEdkmIvNF5PSI69qJyFMisktEDonIuyLSuy7HajAYKrcYdYgMy3VCb71wnPB5wJVQnstwG3B+0KKrZZX3rx6MdzitE8XkBzpZFihVuW4Uuq3psKZNON/nowTKzVnTLAux+1zvoyKP4sU1a+OG5Dpz8GvL4jEq9w+/Vyk9F0fK
4pqFn12sNY1nBs2h24j+BHfmJxx2nGrq2mfRHp2H80vgYnT95p7AZyJyPICICLr0+0j039M4IAN4X0S61fF40xalYFuHfqkehqEWOfb9qhV8q0li+UUSqdEUudDeYx9/FB1NdBy6Du+wpk34l8/HUqAfOhvaefVDdwR51udjWNMmbAXOw1uonOfzMar7qVhKcQ14VoV18ihGAiPtznXx5uBypXiNcKe3+3vPtIK8HkyuFteMAzeTO3kxStWPNgN1aoZSSr0AvOA+JiLLgfXA1cCf0SbLocBwpdT79jWfon8vv0Nro40aZydSU+0SDenHTaNhW1Z2ys0U1ekN4YS2Ri60Ti7Dq8ANaG2iZ5TopMhWrY8vX8nr2TmsoXKOxkHAJ8K3320O82c4TEMX+gsBX9jHZgID1qzlZ/37eoYWuyPOpgG56GQ8C62htLfvt0/B1u+acGnkoOKw4pMCmLyYXrNHYxXmR418SwfSwWfhGPjK7PfRwPeOoABQSu0XkdfQ0YCNXliALsmAERYNGnc3ulRRHb+IE9o6FcIW2iLgIvTOfgQ6iqnJrsp2fq+CiJv27MWK0jSoJdCzcyeWbc3nXLzDXvsDhYRrHJfFSFx0a1YzI879EjgNbf74ZcDPt/4M5j12fUW9r6HX0L3vmKj1vhzcAiOd/RgpERYi4kf/7o5H12suAF60T/dE/11FkgfcKCKtlFIH62SgBoOhSjihrcMhbKEtQgsMpxeE02HuuLZtK33eq1VrPOH1+PKVfLtlGyvQOect0FqA+P00CwTYX1paSSO5zyOiySGyzao7wQ/gMxFebtGMvSUWwXXLKVZP4RRafOflB9iw5gNG3zArIYGx1o6YSlcNI1WaxeeAI8a/Rpucdto/twc2eXxmj/3eDq1xhiEiE4GJAMe0almTYzUYDEkyNyeXo5ViLbp+krNY7yktrRQaOxZ4fk0ePxtQYQqK1qo1Fo6AeQftEH0JXSr8bXRJjktPO5nvc9fxaCiUcEn1WMLJSb77pHUTfvnCdsqs8EKLZWWj2LxxKOtXL06oVW5x4W5tMciamJYaRqqS8n6MNlVej95svCMiJ9jnBPDKTY6plCul5iilBiqlBrZrHlnwzGAw1BXOov0G2gz0GRASYfjJJyLAfRHXTwNQiqdX5oR93l3AL5GCiI7JaC6UV969lIqSHJ9s3qq1hJpqRhTSyXfT3t0TtdCi7qs+P+FbPruYsEgpJ1oqHSKmUqJZKKXW2f/8XETeRGsSdwA3ozWI9h4fa2e/15+yjrWAtbvAlPkwpDXRWq2+8/W3nIO3L2EwsHTzFqacMyhmq9ZY2oVTd+q90iPljWunoSOq2mUEOKNd2xrJTYks51G0bxt6CRuOju06Dt0R8Tqi9VWPx4wDN8Mg3X99yqlLyvMygJglYGqTlDu4lVL7RORrKjTTPLQWGcnpwJbG7q9wSgnMNM5tQxoSrV7VNMtiUSjEuqZNOS+KjaBvu7YxPx+vIKJTd6ooJ5eT7c+fDIzz+2lbQ30mnAU7c8ok/qtG8s+l3xOygujMjbvQnpLVaFfsf4CxMfuqx6O4cDczC/vBoDkMPKcrPTsWpMxMlXJhISJdgO7o2QYd4/MTETlPKfWhfU0bdGj086kZZXrxXzUSHRNgaIjcNeRL8rOeiGN4TU8i8zIc/MBQEU7q2T3mov348pVRPx+vIGJ1BE083NpERTmPAtbnLCJY1gn4BLe/QhvChgG303/Ib6r0nZGs+KSAFQCD5pDV+km2LckG6k7TqFNhISKvUFGjqwg4FZgCBNE5FqCFxafAPBH5LdrsdCf6v86f6nK8BkOqSJWpobpUJy+jup+PJaiqU3nXrU3MXNYPCivCfFd+/BKWlYWXv0Jb1ieC1Lzd2DFTOWXsy01UtVjdtq41i8+A8cD/Ak2ArcAHwANKqU0ASqmQiFwOPAw8gZ71T4ELlFJb63i8BoMhCZLxCXgl3VXHp1BdQeWF05PimUFzwop2qlCI9TmL
2Ln9K+A3wONU+CmcuKHeQHOyP17A6WdclfR3J8KKTwpYMWhOJd9GbZio6jqD+4/AHxO4bg/wU/tlsHH+cA2GhsBTK3P43G6DOuWcQdW+X00XVnRrFJGCYtHcyWzasBkVepLKfoq5aIGxBjiuSg7uZHH7NiYsn1he3bYmtVPTz6Ie4Ti3V3xihIahfuMk7Qk6xyKRPuF1hbW7oFxQzB+1sFJxwHVfvsq367/Fsj5Fl647xX5fCmxA5xeXoIXHoGo5uKvCM4PmoGbMKa85VVN1p4ywqGfM2jAi1UMwGKrNUytzUHa7VHHlWKQFIQs1QzcuKi6sXIZk2Tv/QKm7ie6neAjt3D6JQGApA4aNr/UhR+Lka9RkdVsjLAwGQ53iaBVXUdHlLt20i1gc2L8dbXryojfwLTCEQGAjJ5x6HN37xO6jXttEVrcN7syvkknbCAuDIY24a4hujNOQcbQKpx/FNNJHu3CaOMXGD+Vpf5GsAXx0yVzBxVfflFBdqLpgxScFPDNoDrmTF1e5zWvqn8KQEMa53TiQ1Z/p8Md6GDabCG6twqs+VDpoF2rGnJjl/1sf1Q5dCrEk4kwJMA2fH11xts/otBAUblZ8UsDMZf2YP2ohQFLlRNLrSQxRcZzbXjZUg6G+8NTKHEJRutyRJtpFPIZcdAsiO9B+iYVop/ZCdMeOkwlZc3jn5bksnjcFFVGlNl1w2ryqGXMI3TcnoQZMKc/gNiSOdm4bYWGov3y8eUtC9aFSQaJmmR79ruSrNR+wecMaLOv36HJ2fYDfAtcCPsrKrmDTxqEsWTyNgq3fJt3joq4o16DskNtYGGFhMBjqjJPatSWntDRq4lzfdm3rcjiVyJ28mBWLY5t8xedj7I8fZf3qxby98EGCZf9Ah866aUaw7E5yPv8lqL9SlR4Xdc0zg+bAy8dHPW+EhcFgqDNqOnEuVYjPR49+Y/ng9VkEy2JERikfOrtbV6MtK7uVTRv+mnCPi3QivUSbwdDIcYrDGeoHOuEuWmRUDnAIXd7Ost/vIxhsxYqP/l1HI6w5jLCoB6RD4xND7ZPV+kmAWisEZ4hOcGc+3Ub0T7o6woCh15CR8QCVI6OK0V26fwBkofv1ZQEdgI3s3fVV9QddxxhhUQ9QSttSTSRUw0f8/lQPodHySLO7kv5M975jOO6UbmRkDCU8MupUoDO6dHlkSZAuWBFl1OsDxmdRTzD1oAyG2qE6OUzi8zHmhtmsX72YlUtnlUc97cg/BDyKd0mQLFQoduRROmKEhcFgaNSU5zAtq5rm7ji73Q7rR/7QDxWK4fiuhR4XtY0xQxkMhkZPTRfobH3U0cQqCaLP1y+MsEhzjHPbYKh/DLno54hMx6skiMh0hlxU/8xQRlikMdbugnLntsFgqHlqqtdDJD36jeUH3U/C7x+M2/Ht9w/mBz1Opke/MbXyvbWJ8VmkOd1G9OcZ49xu8GS1fpJtS7JrpR2mITbPDJoT1le7JhCfjzE/ruz4HjDsp2lZYDARRMWvx1vvEJFCYHMdfFVHYFcdfE99wMxFBWYuNGYeKqgvc3G8UqqT14kGKSzqChFZoZQamOpxpANmLiowc6Ex81BBQ5iL+qcLGQwGg6HOMcLCYDAYDHExwqJ6zEn1ANIIMxcVmLnQmHmooN7PhfFZGAwGgyEuRrMwGAwGQ1yMsDAYDAZDXIywiIGI3C4ir4nIdhFRovP3o13bTkRmi8gWESkVkW0i8ozHdWNFZJWIlIjIZhHJEpG0r0udzFy4PnOOiITs6yslgDbkuRCRo0XkARFZISL7RaRQRJaIyLlR7tlg58J17c9FZL39/+MrEbk5ynX1ci68EJEOIvKoiHwrIodF5DsR+auIVMplSHR+UoURFrH5Oboo/auxLhKRdsDHwIXoDicXAb8BDkRcdwk69/8L4FJ0DeMs4P4aHndtkNBcOIhIBvB3YEeU8w19LgYAPwQWAVcDE9CFgj4QkcvdFzaCuUBEfo7+e1gIjAReAp4Q
kVsirqvPcxGGiAiwGLgeeAj9PA8B1wGL7fPOtQnNT0pRSplXlBfgs98DgAKmR7nuSXTGeJs491sFfBhxbBpwBOia6uetiblwXf8HIBeYaV8faExzAbT1eOYA8BXwUSObiwCwE3g24vj/obOaMxrCXHg896n2nEyMOH6zffy0ZOcnlS+jWcRAKRWKd42ItARuBJ5SShXFuO5YoB8wL+LUXCADvetIWxKZCwcROQm4C5gElHmcb/BzoZTap5QKRhwLAl8C5QWgGsNcAGcDnfB+xg7AUKj/c+FBE/s9cl3YZ787629C85NqjLCoPgOA5sAOEVlg2yUPisirInKi67qe9nuu+8NKqe/QDXtPr5vh1gl/AxYopT6Kcr4xzUU5ItIEvTCscx1uDHPh+YxAnv1+eqzr6vFc5AEfAVNFZKCItBKRQWhN6U2llPN3kOj8pBQjLKrPMfb7w4AFjAYmAmeg7dOt7fPt7fe9HvfY6zpfrxGRG4CBwG9jXNYo5sKD6UA34I+uY41hLqI9456I8w1qLpS2JV2GNj1+gfZhfg58i27I7ZDo/KSURiMsRORCO1oj3uuDJG/tzOF3wLVKqXeUUs8D44HjgBucIdjvXlmQ4nGs1qituRCR9sCfgT8opXbGutR+b7Bz4fE91wN3APcppZa6T9nvDXkuYj1jotfV6Vx4UcX5+QcwGO2nOM9+HwgsEBFn7Uh0flJKY+pn8QnQI4HripO8r1MI/117JwGAUupzESlCaxgQe5fQ1nW+LqituZiBjn6aLyJt7WNOx/qjRKREKXWIxjEX5YjIFcAzwNNKqbsjTjeGuXA/43bX8fYR59NpLrxIan5EZBQ68ulCpdQS+9xHIvIt8DZwBTpaLtH5SSmNRlgopYqB9bVwa8euGG1XEIq4rifwqXNSRE4AWgBra2FsntTiXJwO9KZCgLrZhf6PMZbGMRcAiMgIdBjkK8AvPC5pDHPhfkb3YujY4td6XJfSufCiCvPT237/IuL4cvu9B/r/RKLzk1IajRmqtlBKbQNWABdHxE2fDbTB/kNRSm0BcoAfRdziBnTE0Jt1MuDaZTJwQcTrWfuck4PSWObC+RtYBCwBbvCKHGokc/EperPg9Yx7gGXQIOfCaXE5KOL4Wfa709M1oflJOamO3U3nF9q2eDXa/6CA+fbPVwMtXNeNAILohJpL0aG0W9FRL81d112G1jT+DpwPTEEnaj2U6metqbnw+Nx0vPMsGvRcAN3R/9E32c832P1qTHNhX3ez/Ywz7Ge81/751oYyFx5z0wYtEL4HbkFvnm5BC5EtQKtk5yelz5PqAaTzC21nVlFeJ0RceylaiyhBm2H+BXTxuOdV6N1Tqf0HMw3wp/pZa3IuIj7nKSwa+lygM7ajXaMa01y4rv0FsMF+xo3ApCj3rJdzEeVZjgWeRgfAlNjv/wAyPa5NaH5S9TIlyg0Gg8EQF+OzMBgMBkNcjLAwGAwGQ1yMsDAYDAZDXIywMBgMBkNcjLAwGAwGQ1yMsDAYDAZDXIywMDRoRGS6iKQ8PlxEPnAXmBORfvbYaryiqCTY9tZgSIZGUxvKYEgxkyJ+7gfcjW54kxaF4gyGWBhhYTDUAUqptCgGZzBUFWOGMjQqRKSNiPxVRL4XkVIR+UpEpkQUgTzfNuWMtq/dJSKFIjLPVXrdubaTiLwgIkUisldE/ml/TonI+a7rys1QIjIB+Kd9aqOrD8IJ9kvZ17i/53yPe/pFZIaIbBeRYvs7euKBiPQVkcX2GA+LyDIRGVb1mTQ0NoywMDQa7GYzbwA/QTdpugJ4C3gEmOnxkUfRNY6uRxd2G2cfc/Myui7YncC16Oqoj8UZyhvognEA16BbrZ5NeHnqRJgO/AF4Dl36/W1gceRFItIf3YuhPfBz+zl2A++KyIAkv9PQSDFmKENj4jJgKPATpdQz9rG3RaQl8L8i8ohSapfr+o+UUre5rjsN+B8RmaCUUiJysX2/Hyql5tvX/VdEFqO7JHqi
lCoUkW/sH79USn3tnHMpODERkXboiqxzlFK/cY3RAh6MuPwhdEG+4UqpI/bn/4vu+TwVLWgMhpgYzcLQmDgXXfb5hYjj84Am6N29mzcifl4DNAW62D8PRvddfyXiugXVHml8egMt0SXB3bzo/kFEmqPbeb4EhEQkICIBdCvPd9FzYjDExWgWhsZEe2CPUqo04niB67ybyCgl53NOq9ijgb1KqbKI63ZUa5SJcXSU74r8uT3gR2sQU71uJCI+5dGYyWBwY4SFoTGxB2gvIk0cc4xNV/vdqx1sLLYD7UQkI0JgdIn2gQQosd+bRBzv4PHdznfluY5Hfvc+tDb1OLrHSiWMoDAkgjFDGRoTH6L/5q+JOP4j4AjwWZL3+wy9a78y4njk/b1wtJTmEcd32Od6RRwfFfHzauAQukOdm2vdPyilDgFLgb5AtlJqReQrgbEaDEazMDQq3gQ+Bp4UkU7oHfllwP8AD0Q4t+OilHpbRD4G5ohIR+BrdDvRvvYlsXbsTt7FrSLyLDqKarVS6oiI/Bv4mYhsAL5CC4rzI757n4jMAu4SkQPoSKgzgZ95fNftwEdo5/vTaK2kI9Af3YHujmSe29A4MZqFodFgm1tGAc8Cv0c7sEehF9O7qnjbq9Dht39EO5ubUeEb2B9jLDno0Ncr0ALsC+AY+/Sv0SG504F/2/e8rdJN9Pn7gR+jQ2Yvtu8X+V3ZaEGyG/gLWrA8inaSf5TQUxoaPaatqsFQw4jI4+ge3O09nOkGQ73EmKEMhmpgZ1ofhTZpNQFGAjcDDxlBYWhIGGFhMFSPQ8Bk4CR0DsZ36Kzqh1I4JoOhxjFmKIPBYDDExTi4DQaDwRAXIywMBoPBEBcjLAwGg8EQFyMsDAaDwRAXIywMBoPBEJf/BzK+paGoxsnlAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plot_classifier(X_train, y_train, knn, ax=plt.gca(), ticks=True);\n",
    "plt.ylabel(\"latitude\");\n",
    "plt.xlabel(\"longitude\");"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.0"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "knn.score(X_train, y_train)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.96"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "knn.score(X_test, y_test)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Regression\n",
    "\n",
    "- In KNN regression, the prediction for a query point is the average of the target values of its $k$ nearest neighbours.\n",
    "- Note: regression is more natural to plot in 1D and classification in 2D, but of course either works for any dimension $d$."
   ]
  },
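  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The averaging rule above can be sketched by hand. This is a minimal brute-force illustration, not how scikit-learn implements it: the helper `knn_regress` and the synthetic `X_demo`, `y_demo`, `X_new` arrays are hypothetical names introduced only for this example. With uniform weights and no distance ties, its predictions should agree with `KNeighborsRegressor`.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.neighbors import KNeighborsRegressor\n",
    "\n",
    "def knn_regress(X_train, y_train, X_query, k):\n",
    "    # Hypothetical helper: brute-force KNN regression with uniform weights.\n",
    "    y_train = np.ravel(y_train)\n",
    "    # Pairwise Euclidean distances, shape (n_query, n_train)\n",
    "    dists = np.linalg.norm(X_query[:, None, :] - X_train[None, :, :], axis=2)\n",
    "    # Indices of the k closest training points for each query point\n",
    "    nearest = np.argsort(dists, axis=1)[:, :k]\n",
    "    # Prediction = mean target value over those k neighbours\n",
    "    return y_train[nearest].mean(axis=1)\n",
    "\n",
    "# Synthetic data (an assumption for illustration, separate from the lecture data)\n",
    "rng = np.random.default_rng(0)\n",
    "X_demo = rng.uniform(-1, 1, size=(30, 1))\n",
    "y_demo = 5 * X_demo[:, 0] + rng.normal(size=30)\n",
    "X_new = np.linspace(-1, 1, 7)[:, None]\n",
    "\n",
    "manual = knn_regress(X_demo, y_demo, X_new, k=3)\n",
    "sk = KNeighborsRegressor(n_neighbors=3).fit(X_demo, y_demo).predict(X_new)\n",
    "print(np.allclose(manual, sk))\n",
    "```\n",
    "\n",
    "Sorting all distances is fine for a small example like this; for large datasets, tree-based or approximate neighbour search is usually preferred."
   ]
  },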
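  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "n = 30  # number of samples\n",
    "np.random.seed(0)  # fix seed for reproducibility\n",
    "# one feature: evenly spaced points in [-1, 1] with a little jitter\n",
    "X = np.linspace(-1, 1, n) + np.random.randn(n) * 0.01\n",
    "X = X[:, None]  # reshape to (n, 1): scikit-learn expects 2D features\n",
    "# target: linear trend 5*x plus standard normal noise, shape (n, 1)\n",
    "y = np.random.randn(n, 1) + X * 5"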
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With $k=1$:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [],
   "source": [
    "knn = KNeighborsRegressor(n_neighbors=1, weights='uniform').fit(X, y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAY4AAAEQCAYAAACnaJNPAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAwQUlEQVR4nO3deZwcdZnH8c/Tc2YgBEISRK4QIdyKJGIQEAQ5Ioqwq8F4IaisIkvUzYrXGuIFSlBREc2qXAoSBSSIyBFAWWDQcCccCRAC4QwkJCGZs/u3f/yqZ3p6+qhKHzXd/X2/XvPqdHVV9dNFU0//bnPOISIiElYi7gBERKS2KHGIiEgkShwiIhKJEoeIiESixCEiIpE0xx1ANYwbN85NnDgx7jBERGrGuHHjuOmmm25yzh2b/VpDJI6JEyeyePHiuMMQEakpZjYu13ZVVYmISCRKHCIiEokSh4iIRKLEISIikShxiIjUm85OmDkTpkzxj52dZT29EoeISD2ZMweOPBKuugruvx8WLPDP58wp21socYiIxKESpYLOTpg3DzZtgvTM56mUfz5vXtlKHg0xjkNEZESZM8ffyLu6/A3+wQdh4UKYPRvmzo18usdeXM+Nj7wI19wOU/5tMGkA/3n3H2hJJaG7Gy64AKZNKzl8JQ4RkWrKLBWkZZYKpk+PfHO/6I6nWPjQC9iYfeGgfYe8dvo9C2gh6d9j+fJyfAIlDhGRarjnqde4d8Vr8Kfb4O0fzL2TGVxyG7wxlt0njOa4t24f6txdfUn2fNNo/nbvRb5NI5UavlMiAZMnl/AJBilxiIhUwff++ihLnl8PW78NDnlb4Z1vXU5rcyJ04uhLpmhtTsCsWb7KK7M0k9beDmeeuRmRD6fEISJSBV29SY7bb3t+/pfzCpcKTjqJC049m5/cupxkytGUsKLn7u1P0dqU8FVcs2f7Kq/ubv8eiYRPGrNnl6V9A9SrSkSkKnqTKdqaE9isWVh7OwbD/9rbsTPPpL2lyR/TnyO55Dp3f1DiAN+4vmgRzJjhe2zNmOGfb0ajez4qcYiIVEFPX3BzD1EqaLtrhT+mP8mo1qai5+5NptiyPeN2Pm1a2UoXuShxiIhUQW8yq1QwfbrvHrt8Oey+u2+fCG72bc0+WXT3hS9xtDRVrwJJiUNEpAp6+nxV1YACpYL2Fr9fT38y1LmHJKUqqMk2DjP7m5k5M/tu3LGIiIQR5eaeLnH0ZLdx5Blt3tufok0ljvzMbCZQpC+biEhMOjt9FdSyZX7cxKxZ9L/jQJIpR2tT8fYKYKBk0pNZVVVgtHlv66EqceRjZlsDPwa+HHMoItIIos4nlWeCwd653wEIfXNP96ra1NtPMuVI3nMPyfPPJ9nVTRIjaQmSDv/8/PPp7e6tauKotRLHD4GlzrkrzeyKuIMRkToWdT6pAlOJ9P78Qjjt4qFtHAWke1KdND8jUZ1xVf4D+txAsqmGmkkcZnYI8ElUTSUilRZiPqlHdtyLs65+mP70QL7nVsFHzst5uv6Ev6mHLRW8dccxfOv9e/NGT7/fcNFF8OKLefdPbL89H/zKe0KduxxqInGYWQvwK2Cec+6JuOMRkfq2/qe/YPYxX+KN1o7hLxpw5RJW7dDFy+u7OWLPCX77w8/D66/nPee+fWs4bHK4m3tLU4JTD9l1cMOvX4TOwqPNGZsj1gqpicQBnAWMAr4X9gAzOw04DWDnnXeuUFgiUo+WvbSBmw88iD1WP8NW3RuH79Ddy3ZbtTFj6o6cccTuftsN82BhhW7uVZqDKqwRnzjMbGfgG8BngDYza8t4uS1oMN/gnBvS4dk5Nx+YDzB16lSHiEhIvbvsAsDcW37JtOeWDH0xnQQ+d/rQ7ZW8uVdpDqqwaqFX1SSgHfgdsDbjD2B28O/94glNROpRzwn/BkBrsm/4i/mSQPrm3tHhb+rgHzs6ynNzr8IcVGGN+BIH8CCQq2Lwdnwy+Q3wZDUD
EpH61rvbZLjnPlqbm/zNP+wv/CJTiZSswnNQhTXiE4dz7nXgjuztZgaw0jk37DURkVL0JX07Rev//gouvjBaEhghN/dKGvGJQ0Sk2tLTmbce8HY4+sqYoxl5ajZxOOeKr24iIrIZBhJHFUdj1xJdFRGRLL1JJY5CdFVERGDIvFS9F18KUNU1LmqJroqISNbkhL2PLAWg7ZzQY44bSs22cYiIROWc48lX3mBTb8Z44SVL4LJrYasdYCu/6bkxfhqR1vPnwfuOrfteUlEpcYhIw3ho1TpOuPCu4S+cdM6wTaN7NpLo7vJjMpQ4hlDiEJGG8fqmXgC+8b69eMuELfzGWV+EJ4ePId5x/St+4N/y5VWMsDYocYhIw0g5P23dO3Ydy/47be03jgWeuT//5ISTJ1ctvlqhxnERaRjp3NBkGcPAZs3yU4nkEsPMs7VAiUNEGkYyKHFk5o2KT05Yh1RVJSINwwWJoymRNfFEpScnrDNKHCLSMIIB4SQsx4xFDTA5YbmoqkpEGkZqoMQRcyA1TpdPRBpGaqCNQ3OklkKJQ0QaxkCJQ4mjJEocIhKvjMkFmTnTP6+Qgm0cEpoSh4jEJ2tyQRYs8M/nzKnI26VLHAnd+UqiXlUiUhXruvroT2aMzr7vPvj5r8A1Q/vowe0Ov/2IYxh98LSyromRSgWJQyWOkihxiEjF3f74K5xyyb+Gv/DZ3+Q/6Ma17Pfw3Vz/n4eULY4gbwwfxyGRKHGISMW9sK4LgP8+Zg9Gtwe3nXN/AM89l/eYG95xLE90lHeeqPTIcZU4SqPEISIVl/6lP2PqTowf3eafdLwOD92Yd3LBVdMO4+G+HK+VwA0kjrKetuGoiUhEKi7nDbvI5IKtU6fQ05/M/fpmSqbyTDkikShxiEjFJXM1SheZXLBtl51IOYY2qJcpDg0ALI2qqkSk4tJVVYkIkwu2/eMpAHr6UzSXaY4Qp8bxslDiEJGKK9i2kGdywbbmJsAnji3ayhNHUm0cZaGqKhGpuJxVVUWkx28MaecocZR5Sr2qyqImEoeZfcjMrjazlWbWZWZPmNk5Zja6+NEiErfNGT/RFiSO3v6gjaMMo8w1ALA8aiJxALOBJPB14FjgIuDzwC1mViufQaRhpXKtvFdEZlUVnZ0wbx5s2jTYUJFK+efz5oUueWgAYHnUShvHB5xzqzOe/93M1gCXAocDt8USlYiEsjm/9Ntb/G/CD110Ny0b34BT5+fcry3Zx/xfXMa+IRZhGqwyCx2G5FATiSMraaSl5y/YoZqxiEh0A7/0IySOA3cdy2nvnsSm3n74459g9fDbwMbWUVy77xE89lwv+4Y4p3MOM3XHLVVNJI48DgseH4s1ChEpanOqqka3t/D19+3ln1z1fVi0YNgo81c7xnDtvkfQvX24349J59S+UQY12T5gZjsA3wZudc4tzrPPaWa22MwWr87xS0VEqidV6i/9PKPMR/X1ANB16OEh49AiTuVQc4nDzLYErgP6gVPy7eecm++cm+qcmzp+/PiqxSciw6WcK+2GnWeUeXurb0DflC5xFOmum0q5SKUeya2mEoeZtQMLgUnAMc65VTGHJCIhJFNl6AI7dy4sWgQzZvjEMGMGTbfeSmtzgq6+ZKjuuinn1KOqDGqmjcPMWoCrgQOB9zrnHok5JJGRq7PTT+OxbBlMnjwwjUdc0o3SJcsxynzUTTfT/ewLg9110zK7606fDtOmlSeBSW0kjmCsxu+BI4HjnHOVW5RYpNbNmeNvll1dfszDgw/CwoW+qmfu3FhCquQv/Y7WJp58dAV/2/FtfvXAbAmD+dfAlhN55rWN6opbBjWROIALgQ8D3wM2mlnmT45VqrISCWQOlEvL8cu72ir5S3/C6DbuWrc9d534jcI7/u4+AHbZtqMicTSSWkkc04PHbwR/meYCZ1c1GpER6FvXLWH57Uvh+Hw3UIM/LOWta8bwtXQ31ypJlauqKofLPv1Onv+PM+HmmwcHjGRKGBxz
DJxzLgDbj8mzBoiEVhOJwzk3Me4YRIAR13aQ5pzjsntWsoNrZ4dEU979nmUUD3eurHricBWsqhozqoUxp58MV182tKSV1tEBn/8kvHmrirx/I6qJxCEyIozAtoO09A/tk3pWcuYf/ifvcqzf/49zudyNrW5wVGHgXbq77rx50N3tP38i4cd+zJ49IpJ7PVHiECnAOcdTqzfSc/+DcMkfYcvtYMusnS75Ixx0FFsdeAA7jY2n/nxguvCDpsHl7bl/ebe3Y1MOILWiN/oblFjSSrkqzA9VYFEoKS8lDpEC7li2mlMuDqZFm3legR3XwR23c/vsw9l13BbVCS7DwJQeO+9c8Jd3YocdcE+viHbyMpS0XLWm+sizKJSUlxKHSAHrNvUB8J3HrmfCYw/l3W/J/ofys0mHsa6rr1qhDZGeadyMgr+8Ezc9PpBkQilTL61kSnNE1RMlDpECXDAw4JCtUuz61D/zth20HfIugGg35TIatrJdnl/eCbNQMSZTjkvufob1v7sdDjgh5/iI0X1dnHzBT2kJkTiqUlUlVaPEIVLAwC/5k0+Ga36ft+0gccIH4e4NA2trV1v6bYvdnM0sZ4/VbI+/tJ7v/OVR2PqtcPBb8+739n/9mikh4ks5R0KZo24ocYgUMJA43r5/4baDvfaGu+8NdVOuhLBraafv3X4KkPz79iX9+X77yu0ccemPh5W0Fu+wFx/6+Hls3GXXcPGpqqqu1NQkhyLVls4DhuWcZI9Fi2Du3IEbciqmzJEaaOMoljhsyP75zxckog8en3M68/b0dObvPyF0fCpw1A+VOEQKcNkLEOVpO7CQN+RKGYizyH4DCc45mgrsPbDU61575SxpjWr2x3ZPekuo+FJaQKmuqMQhUkDYPJBZBRSHVIQ2Dr9/4TgHlnpN5C5pjbr8UgC6epN+x2LrYKiNo66oxCFSSPqGXOSml3497hJH0TiDxFEsvyVThUtaozb2ws23DK6DUWScRyqlqqp6ohKHSAHp7rhRqoDiELaNoymR3r9wnOlElG/VvlHBynvdTz0zOM4jfc7McR5ByUNVVfVFJQ6RAoYMrCsgbBVQpQyUOIrEmb55J4sUjZJFSjBtzT4D/e2BZ3npXZ/IfRIzuOwueHlLHn1xPVu1txQOTmqGEodIAUN6VRUQtgqoUgZKHEXiDNuIP9hmkvt8Zsa73rItjz62iZV7H5b/RE3N8NALALxnjwmF31RqhhKHSAFhSxzxV1WFLXH4x2KN+AO9qgqc74rPToOZF/i1vfOMqOekk+BHVxQOSmqO2jhECkiF7uYac+N4Vhz5RB3HUXQNjVmzco7zAPz2M88sfLzUJCUOkQIG7q9F2zj8Y2wljuxeUHmELRkVq6oakF4Ho6PDlzDAP3Z0aB2MOqaqKpFCBkocYds44p6rKmwbR5HG8YGqqhA9obQORsMJnTjM7LfAd5xzwybzN7NdgDnOuVPLGZxI3AYax0P2Vop9rqoidQhhG/FdyPMN0DoYDSVKVdWngPF5XhsHnFxyNCIjzEDjeJH9RkrjePGS0dD98xnojquxF5JD1DaOfN+2NwFdJcYiMuIMzlVVnm6ulVLuklHoNg5pSAWrqszsRODEjE1zzezVrN1GAYcC95U5NpHYDY7jKCzuuapcyBLCQCN+kcwRpjuuNK5ibRw745MC+P+H9gd6svbpAe4GvlbWyERGgPDjOOIdOR62hBC2jSN0d1xpSAUTh3PuAuACADNbAZzgnMu/8LJInYk6cjzXOLhqCD0AMORcVZF6VUnDCd3G4ZzbNc6kYWY7mdmfzGydma03s2vMbOe44pHG4EK2jsc/jiMdR9gBgMUmOQz2V4lDcojUOG5mO5jZj8xssZmtMLN9g+1fNLN3ViZEMLMO4DZgT3zvrU8AuwO3m9kWlXpfkbSiVVWJeOeqGpjFN/RkjIX3S4YswUhjijKOYx/gTiAJ3AO8HWgNXt4FOBD4aLkDDHwWmATs4Zx7MojnYWA58B/A
jyr0vtLgaqU7btgBgOFHjheeVl0aW5QSx/nAY8CuwL8x9P+lu4FKjv45HuhMJw2AYCDiXcAHK/i+0uAGf8mXZw6oSgk/yWG0FQCLfW5pTFESxyHAuc65Nxg+nuNl/FiOStkHWJJj+1Jg7wq+rzS4sCWO2Ns4opY4ijTip7vjqleV5BIlcRT6qo2jsgMAxwJrc2xfA2xTwfeVBhd1YF18a45HbeMIV1WlvCG5REkc/wROyfPaDHy1USXl+qbn/Vqb2WlBI/7i1atXVzAsqWeDJY4KVlV1dsLMmTBlin8MlluNInwbR7Q1x9WrSnKJkji+A3zAzG7G92pywHvN7FL86PLvVSC+tLX4Uke2bchdEsE5N985N9U5N3X8+HxTbIkUFra30maPHJ8zB448Eq66Cu6/3y+KdOSRfnuUOEOWOMI2jodNRNKYoozj+DtwAr5x/Lf4X/vn4keWn+Ccu7cSAQaW4ts5su0NPFrB95UGF33N8Qgn7+yEefNg06bBN0ql/PN58yKVPKKOHA8/yWHoEKSBRFqPwzl3A3CDme0GTABec849UZHIhloIzDOzSc65pwHMbCJwMPDVKry/NLhyzToLsL67j9/cuYLua/4BB87IUwlrcOk/YO0Yxna0ctq7JxXs4RS2jSNd9TQkwXV2+rU0li2DyZNh1ixSblzwuZQ5ZLjNWsgp6Bb7ZNEdy+d/gTOA68zsm/j/1b4DPAf8qopxSIMJXwUUfgDgncte5YJFy2kdvQd2wG4FTpog9X8r6Es6jtp7OyaN3zLvroON2eES3ECV2pw5vnTT1eWDf/BBWLiQ1BfPB3ZS4pCcogwA/GSBl1PAOuAB59yqkqPK4pzbaGZHAD8GLsdXky0Cvhh0DxapiPADAMNPcrixtx+A2164nh2vvDh339hEAk46iZu/8SNOu/w+NvUmiwQ6NI7icTK0qiwtqCpL3XU3TDtJ3XElpygljkvIPct05raUmV0FnOKc6y09vIw3ce5Z4N/LeU6RYga744acrjxEiaO7zyeBUZ89Fa69cuiNO629Hc48k1GtTUOOyScVui3GP55xxf20r3kVPn5Bzv1eHzUaUBuH5BYlcRwM/B64HvgTftDfdviuuO8HTgf2BeYCK4GvlzVSkRhUosTRFZQeRh10IMye7X/1d3f7X/uJhE8as2fDtGmMemaNP6Zo4gjXmP3WHbfm49N2ZmNPEm64F9asybvvbu0pzI4r+nmk8URJHLOBPzjnMhPCMuBOM9sAnOacO9HMtgI+hhKH1IHQjc65uuPmaHRm2rSBJNDe3ARz58L06X6/5cth990H9gNob/Eljq7sqqqsc6c+8vkgzsKBbtnWzHdP2M8/+fMP4MYFBavKRHKJkjiOAn6R57Xb8I3XAP8AvlJKUCIjRdiqqmEDAPM0OjN7Nl3v/AjtLYnBwXXTpg0kimzpqqohJY4c53YPPQfHfy1aY/asWT6mAlVlIrlEGQDYC0zJ89qU4PX0OTeWEpTIiBFyQN+QuaqKjM/oeu55RgUliWI6WrNKHHnO7Xr8wpyJJbmmdMtj2jRfJdbRMbjCUyLhnwdVZSK5RClx/BG/5ngS38bxCn4sx4eBs/GDAsEvL1uNsR0iFecoXk0FvkRiBpfds5Ib1r4CM+eRb5acl59czxZjx4R6/3SC+cmty7n0npWw8hmYce6w/Ta0dfiz/+FKmH5IqHMDRavKRHKJkji+DIwGfhj8ZboC+K/g30vw63WI1DznijeMp515xO4sf2UD3Ho/rH097367JzfwzhNnhDrnmFEtnHLwRFatDeYQfehlWJf73O98bgm7r74vZLQZClSVieQSOnE457qAj5vZt4F3AtsDLwL3OueWZex3Q9mjFImJw4Vek+JLR032//jLPLi+SKPzu8I1A5oZcz6QMdvOjefDdWrQlniFauMws1Yzu9/MjnbOLXPOXe6c+2HwuKz4GURqU5QSx4BZs3zjci6lNjpX8twiIYVKHMFgvl2B/sqG
IzKyhG3jGKKSjc5q0JYRIEobxy3A0fiutyINwZc4NmP4dCUbndWgLTGLkjh+BvzOzJqBP+PbN4Z0G0nPXCtSLxybU1cVqGSjsxq0JUZREsffg8cvA1/Ks0+4zukitaKEvCFSr6IkjnzLxorUrc1q4xCpc1G6415ayUBERiLn3Oa1cYjUsShTjog0HOdU4hDJFmkFQDObAMwE9gCyO5M759ynyxWYjGB5Zn2tRw61cYhki7IC4B5AJ74BfAvgVWBs8HwtfgVAqXcFZn1l7ty4oys7X+JQ6hDJFKWq6jzgn/jFmwyYDowCPgNsAk4se3QyYryyvpvnb7uL5395Mc83bcHzW47j+dHjeX6Lbf3zX17Mi7ffNXQ9ijrgpxyJOwqRkSVKVdU7gM8BPcHzhHOuH/itmY0DfgK8p7zhyUjwl4df4IwrHvBPTrko/443vc43m1fwmUMnVSewKtisKUdE6lyUxLElsMY5lzKzdcC4jNcWA98qa2QyYqx8zS/08/1Hr6P5mWfy7veNY7/AKxt68r5eq1RVJTJUlMTxDPCm4N9P4Nfh+Fvw/P3A62WLSuKTo+F7ffcYWpsTfHSLDbB0Ud6ZWeceezrJVJ1VVTlVVYlki9LGcSt++ViAHwGnmNkTZrYUmAVcXO7gpMrmzIEjj4SrroL774cFC+DII9lwZydbtbcUnZk10dpaf4kDVVWJZItS4jiLoAuuc26BmXUBJwEdwAXOufkViE8qZFNvP8tefmNww5JH4PJrYcyOkLU43XMrXmCrfd40ODPrvHnQ3e1LHomETyazZ9Pc1Fx/iUO9qkSGiZI4Hsf3nHoIwDl3PXA9gJnta2ZPO+fqp1W0zn3z2iVc88DzQzfOOCfv/ge++qL/R4GZWZu+ewvJOutVlXJOJQ6RLFESx0SgLc9r7cAuJUcjVfN6Vx+7bNvB2enV5WadCU8+lXf/Pd80GjjVP8kzM2vCjFS9lTjQyHGRbJFGjpM1jXqGqVSocdzMJgNfwHf1nQRsAP4F/I9z7qFKvGcjSDnHmFEtvGfPCX7DWINn7s+/JOk7iy9J2pww+ustcThQK4fIUAUbx83sS2b2rJk9i08a16efZ/ytBi5ksIdVuR2NTxqXAh8ATgfGA/ea2ZQKvWfdS6Ycicyf0mVYkjSRqL8SBxoAKDJMsRLH08Ci4N8n48drrM7apwd4FPh1eUMb8AfgQpcxJNnMbsN3D54FfLJC71vXUs7RlMi4IxZp+A4zF1VzwuqujUMDAEWGK5g4nHPXAdfBQM+SbzvnVlQhrswYXs2xbZ2ZLQN2qGYs9SSZcjRl/5QucUnSRJ1WVanEITJUlPU4RsxCTmY2FtgXjR3ZbKl8N8QSliRtKqVxfITOuOvQehwi2Wp1PY6f4WsQfpJvBzM7zcwWm9ni1auza9cklcqqqiqDpoRt3jiOPAMPmTOnrPFtDpU4RIareuIws/eamQvxd0ee478GfBQ4wzn3ZL73cc7Nd85Ndc5NHT9+fIU+Te1KZrdxlMFmJY7OTt+usmlTuguTb1/ZtMlv7+wsa4xRaeS4yHBRu+OWw93AXiH225S9wcw+B3wf+KZz7rflDqyRpLJ7VZVBU4TG8efWbOKfK9bArxfCpGm5O3onDH69kG223pUj9tyurLGGpZHjIsNVPXE45zbhR6FHYmafAH4BnO+c+17ZA2swKefvy+UUpcTxvRse429LX4JxB8NxBxfe+ZLF3DH7cCaO26IMUUbj8g5dEmlccZQ4IjOzE/EN4b92zs2OO556kKxEG4eFTxzru/vYb4cxXPjPS+GGG/IOPLz73z/NV7edxvruvrLGGpraOESGGfGJw8zeDVwJPAxcYmaZXW16nHMPxBNZbUu5ClVVhUwcXX1JxoxqYecvfBquucK3aWTr6GDV8UfDXevp6k2WNdawNOWIyHC10KvqCPwcWW8H7gLuyfi7Nsa4alpFShwJIxWyjaO7L0V7S9PgwMOODj/gEPxjRwfMnk372/YF
fKLZbJ2dMHMmTJniH/M1uOfYzzl1xxXJNuITh3PubOec5fmbGHd8tapSJY6wAwB7+pK0twRfv7lzYdEimDHD37RnzPDP585lVEsTAN2ZiSNsIoDwXX3z7OceflglDpEsI76qSioj5fxI73Jqyp6rqsCgvu6+pC9xpOUZeJhOHAMljjlzfDfdri7f5enBB2HhQl9qmTt36MGZXX3TMrv6Tp/u37PAfu6xx7EDJpRyWUTqjhJHg/JTjpT3nE2W0R23yA2+uz81WOIoYFRrkDh6U6ESwXOT38rSF9b51+ZfAzvuT86+vpbgzsvv4fEHUvDEE3Di2Tnf/+mxO7L166+HvQQiDUGJo0GlnCt7iSORMPqTLtQNvqs3SXtzU/6TBdKJ4+r7V7Fkyb1w6Cm5x3wYuMs7uXL0a4PbJhwGJx5W8PwT3+hhx40boL8n5+v7vPIUh/atBmYWjVWkUShxNKhKDABsThhPr97I0Queg4+cl3/HBU/R1br10KqqPLZsbWbKLtuw8rVNrGzdHnYrMAtASzPbbdXGZw6ZxCG7j4OzzoKbb87b1bft6Pcy6dxfwcz58McF+dciOan4WiQijUSJo0ElXY7ZcUs0Y+pO/h+LHoFC1TupN9j9qL2Yvt+bip4zkTCu/vy7/JOZM32jdaEb/I+vGNx2+slw9WV5u/pyejBv56xZvhot134h1yIRaSRKHA0qmSp/4/h79pzgVxS8YR4sLHKD/+hXor9B1Bt82DVGyrAWiUgjGfHdcaUynHM0Veq/fhlWE8ypyJiPnDf4Al19N2s/EVGJo1ElKzCOY0Alf8FvzmJTYdcYKWEtEpFGosTRoIatOV5uJa4mWJBu8CKxUuJoUJVYyGkY3eBF6pLaOOpVkWk5Uo7KJw4RqUtKHPUoxPxMSec0B5OIbBYljnoTcinWVKr84zhEpDGojaPOPHjR75jzoe/Sl8g1KttgwXK4t4/+arRxiEhdUuKoM//ckOChyZN5z1P/oimVYw2LRC9svRc7jR3FUXvHs463iNQ2JY460zfeTwH+y2u/R1uyf+iL6VHbJ/9XDJGJSL1QG0ed6Z3m53VqSeYobWjeJREpAyWOOtP/5jfTTIpEx6jw03KIiESgqqo605d0tLS0+HmWKjFqW0QanhJHnentT9HSZBq1LSIVo6qqOtOXTNHarP+sIlI5usPUmb5kipaKzZcuIqLEUXf6kk6JQ0QqSneYWpc1mWHfK6t9G4eISIXUXOO4mc0ErgCed87tGHc8sZozx88/1dXl56V68EH6+nenZbf94o5MROpYTZU4zGxr4MfASzGHEr88kxn2OaPlpReHTaMuIlIutVbi+CHwEPAi8N6YY4nFmo29/O+dT9Oz8P/goI8NJo3AE+Mnst0br/kxHOqOKyIVUDOJw8wOBj4OvBX4ZszhxOYfy1Zz0R1P0dExiab9ds65z7HL7oZ1y6scmYg0ippIHGbWAswHznPOPWkNvI5Ef8qXMG56+a/sdMXFfq2NbOnJDEVEKqBW2jjOAtqAc+IOJG6poGrKTjnFT1qYiyYzFJEKqnriMLP3mpkL8XdHsP9uwDeAM5xz3RHe5zQzW2xmi1evXl3+D1JkTe9KSQUljsQBB/hJCzs6NJmhiFRVHFVVdwN7hdhvU/D4U+A2oDPoVQXQCljwvMc515V9sHNuPr56i6lTp7rs10uSoxssCxf6G/bcuWV9q2xB3iBh5t9r+nRNZigiVVX1xOGc2wQ8HuGQvYFdgLU5XlsLXAB8sfTIQsrsBpuWuab39OkVvXGnq6oGVn3VZIYiUmW10Dj+ESC7Mv+rwBTgw8CqagXyyvpubv3NDbjJhw3rBgv4u/lv/gq2PQe/ZRwTx21R9hhcuo2jgTsIiEi8RnzicM4Nazwws0/hq6juqGYsv/rH0/xm22lwTJFf+Ncu4bj9tufCjx1Q9hgGq6rKfmoRkVBGfOIYSTb29DMu2cVff/W5/N1gjz+eT079FF19OZZuLYN0VVWT
MoeIxKQmE4dz7lNxvG93X5KO0VswIdk1tI0jraMDzjiN9vuS9CVzJJYySJc4VFUlInGplXEcI0JPf4q2LTuKdoNtThjJVHk7cqW57MZxEZEqq8kSR1x6+lO0tzQV7Qbb3GT0JyuTOAZ7VSlziEg8lDgi6O5L0pZelrVAN9iWpgRv9PdXJIZ0DZgSh4jERVVV+eQYGd7Tn6Ktpfgla6pgVdXAlCPKGyISEyWOXObMgSOPhKuugvvvhwUL4Mgj6Xl2Fe3NTUUPb04k6KtQVZVTVZWIxExVVdkyRob3NDXjCG7QPX10vbaWtvXbFD1FS5PRX+FeVeqOKyJxUeLIdsEFfg4q4P2fuoDl43YZ8vL+yx4Djip4iuamRMWrqpQ3RCQuShzZli0bmE7k0/+6jjWjthry8jGt64HCU5Y3J4y+XAMEy0DjOEQkbkoc2SZP9rPdplJ85OGbh74WcoGk5oSRrGAbh0obIhInNY5nmzWr5AWSmpsS9FWwqkoN4yISJyWObNOmlbxAUnMiR+N4mRZ+Sjn1qBKReKmqKpcSF0hqbrKBtcGBsi78lEo5jeEQkVgpceRTwgJJLU2JwSlHyrzwU8o5dcUVkVgpcVRAU8Lo7k/yucvvg38+AEd/EcjR5mEGv38Ae7KFzxy6K1N2GVv03KqqEpG4KXFUwEGTtuX2x19hxasbIdUGY9+cf2fXxvKlLzF+dFvIxKGqKhGJlxJHBbx78njePXm8fzLzIj9lSb6Fn046iXdsf2roKUqcShwiEjP1qqq0EN17W5sSoRd+Smkch4jETImj0kJ0721psoiJQ5lDROKjqqpqKNK9tyVSiUPTjYhIvJQ4qqXIwk+9/WHbOBxNKieKSIx0CxoBWprDlziSKVVViUi8lDhGgNZIbRzqVSUi8VLiGAGitXFoHIeIxEuJYwRoaUrQmz2OI8+kiBrHISJxU+P4CNDanKCvP6PEUWBSxNReJ2gch4jEqmZKHGa2g5n91sxeMrMeM1thZufEHVc5DBkAmDkpYrASYeakiKnVr6rEISKxqokSh5lNBO4CVuDXbX0ZmAjsFl9U5dPSZDzz2kaO+tHfYdUqmDmPnJMiYrz43Btst9021Q5RRGRATSQO4JfA88B7nHN9wba/xxhPWX146k70pkscj6yCta/n3Xf35AYO+fePVCcwEZEcRnziMLO3AMcAn8xIGnXl4N3GcfBu4/yTv8yD6wtPisg7v1LdAEVEMtRCG8fBwWOXmd0StG+sNbPLzGzbWCOrhDKseS4iUkm1kDjSi1n8FlgGTAfOAo4DbjKznJ/BzE4zs8Vmtnj16tXVibQcyrDmuYhIJVW9qsrM3gvcEmLXvzvnDmcwud3hnPtC8O/bzGwd8Ad8NdaN2Qc75+YD8wGmTp0abiKokaLENc9FRCopjjaOu4G9QuyXXqT7teAxO9ncHDy+nRyJo+aVsOa5iEglVT1xOOc2AY9HOGRp+tA8r4ebq0NERMqiFto4OoGXgGOztqef/6u64YiINLYR3x3XOddvZl8FLjGzXwLX4Af+fQ+4A7gtxvBERBrOiE8cAM65S80she9NdQqwBvgd8DXnXG01fIuI1DhrhPuuma0GVkY4ZBzwaoXCqUe6XtHpmkWj6xVNOa7XqwDOuexmgsZIHFGZ2WLn3NS446gVul7R6ZpFo+sVTaWvVy00jouIyAiixCEiIpEoceQ2P+4AaoyuV3S6ZtHoekVT0eulNg4REYlEJQ4REYlEiUNERCJR4gDM7Mtmdr2ZvWhmzszOjnj8CWb2gJl1m9lKM/ummTVVKNzYmVnCzL5mZs8En/khM/v3kMdeElzj7L+fVDjsijOznczsT2a2zszWm9k1ZrZzyGPbzey84DvYZWb3mNm7Kx1znEq8Xrm+Q87M9q9w2LExsx3N7GfBd2NT8Hknhjy2rN8vJQ7vs8AE4M9RDzSzY4Cr8XNmTQcuAL4JfL+M8Y003wHOBn6O/8ydwB/N7H0hj18NHJT19+Py
h1k9ZtaBn/5mT+Bk4BPA7sDtZrZFiFP8Bv89/BbwfuBF/Hoz+1ck4JiV4XoBXMLw79Gysgc7cuwGzADWAndGPLa83y/nXMP/AYngsRk/C+/ZEY59AL92SOa2bwG9wJvi/mwVuFYTgB5gbtb2RcDDIY6/BFgV9+eowHWZBSSB3TK27Qr0A18ucuzbgu/dKRnbmoEngIVxf7aRdr2CfR3w3bg/R5WvWSLj358JrsHEEMeV/fulEgfgnNusqdnNbCdgf/y8WZkuB1rwv8brzTFAK8M/8++A/cxs1+qHNCIcD3Q6555Mb3DOrQDuAj4Y4tg+4KqMY/sJFiozs7byhxu7Uq5XQ9rc+xQV+H4pcZRmn+BxSebG4H+ATcDeVY+o8vbBlziezNqeXjclzGeeYGavmlm/mS0zs7PqoE1oH7K+B4GlFL8m+wArnF+rJvvYVnwVRb0p5Xqlfd7MeoL6/tvM7NDyhVdXyv79qonZcUewscHj2hyvrc14vZ6MBV53QXk3w5qM1wt5ELgP/6VtB04EzsHXb3+mfGFW3Vhyfw/WANuUcGz69XpTyvUCX8L9C/ACsAvw3/glpY9yzt1RriDrRNm/X3WXODZjTfOS3i54zDWK0nJsG3E243oZJXxe59xPsjb91czeAL5oZj9wzi0Pc54RanOvS0nXtIaV8j36RMbTO83sOnwJ5rvAIWWIrZ6U/ftVd4mD6Gual6JQxt464/WRLOr1WgNsY2aWVerYJuP1qK4EvghMBWo1ceQrYW5D7l97mdYAubqhlnJNR7pSrtcwzrkNZnYD8OlSA6tDZf9+1V3icNHXNC9Ful5/H+Ce9Magb3UH8GiV4thsm3G9lgJtwFsY2s6RrpfenM9cqORWK5Yy2OaVaW+KX5OlwIlm1pFVD703vndedntSPSjleuWT75d1oyv790uN4yVwzj0LPAR8LOulj+N7MdxY9aAq72/4L1uuz7wk6BgQ1Ufx/8PX8vrxC4FpZjYpvSH4AXFw8FqxY1uAD2cc2wycBNzsnOspe7TxK+V6DWNmWwHHAfeWK8A6Uv7vV9x9k0fCH76K5EP4wTUOWBA8/xDQkbHfIuDJrGPfB6SAXwGHA18CuoHz4v5cFbxe5waf8cvBZ74ouAYfyNpvyPXCN2L+AzgdOBr4APDb4NiL4v5cJV6TLfC/3B7Bdyc9Hv+j4mlgy6xr0A98K+v4P+CraD4DHAn8KbjGB8T92Uba9QJmA/+L/8FxOH4A4SP4HzSHxv3ZKnzd0veli4J71eeD54dV8/sV+4UYCX/4QWkuz9/EjP3uAJ7Jcfy/BV/6HuBZ/ADAprg/VwWvVxN+dPzK4DM/DHwox35Drhe+TvvPwXHdQBdwP3AGGYObavUPX498NbAe2BB81olZ+0wkxyBTYBTwI+Cl4NrcCxwe92caidcL/4PjLvzSpn3Aa/hf1QfG/ZmqcM3y3afuqOb3S9Oqi4hIJGrjEBGRSJQ4REQkEiUOERGJRIlDREQiUeIQEZFIlDhERCQSJQ6RLGZ2qpktN7NeM3u9zOeeaGZnZ46YFqk1ShwiGczszcB8/OSPRwDvLfNbTATmAEocUrPqbpJDkRLtjh8Zf6lz7v/iDiYsM2tz9TmnlYxAKnGIBMzsEvw0KQCLzMwF2zCzz5rZQ2bWHaxe+BszG5t1/Blmdo+ZrTGz182s08yOy3j9cOD24OktwfldsJ3g32dnnXNisP1TmXGa2SozO8jM7jazLuCHwWvjzOwiM3s+WB3vcTM7rTxXSMRTiUNk0HfwqxP+FPgCfh6t1WZ2LvBfwfb/BnbALxi0r5m9yzmXDI6fCPwaeAb//9YHgL+Y2fucczcG5/sCcCFwJoOzAW/ONOJj8BPXzQO+DnQFM8TehZ+X6GxgBX6N+IuCEsnPNuN9RIZR4hAJOOeeMrPHgqePOuc6g6m+/xuY65z7dnpfM1sG/B8+Ofw5OH52xusJ/OzAk4HPATc659abWTpJPOac6ywh3C2Bjzvn
rst4z//Bz466nxtcSfFWM9samGNmFznn+kt4TxFAVVUixRyF///k92bWnP7Dzy66Hnh3ekczm2JmfzGzl/FTW/cFx+9Rgbj68WtuZzo2iGtFVqw3AdsyuNiWSElU4hApbELwmG+VtG0BzGwnfAnjUeA/8dPr9+Orv8IszRvVKxlVZGkTgN3wCStvrCKlUuIQKey14PFocq+FnX79WHy7wwzn3Kr0i2bWEeG9eoDWrG35bva51kN4DXgFmJXnmCcixCKSlxKHSGG34Fco3Nk5d0uB/dIJYuDXvplNxi+Fuipjv3SX2VE5zrES2Ddr23E59svnbwSlHefcKxGOE4lEiUOkgKDB/AfAz81sD+Dv+BXUdsK3X/zaOXc7cCu+auoyMzsf2B6Yi6+yymxLXBbsd6qZrcEnkieccxvwvaS+aWbfADqBQ4GZEcL9MX4d6TvN7Mf4EsYWwJ74JVU/uDnXQCSbGsdFinDOfR04Dd8QvgC4DjgLX3W1PNhnKfAxfK+mhcBXgK/i11jPPNdr+KVy34ZPQv8CpgQvnwP8PHj9z/i2kU9EiHMd8C7gr0F8N+HXdP8gg+NHREqmpWNFRCQSlThERCQSJQ4REYlEiUNERCJR4hARkUiUOEREJBIlDhERiUSJQ0REIlHiEBGRSP4fhCDrKhhzyegAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.plot(X, y, '.r', markersize=15)\n",
    "grid = np.linspace(np.min(X), np.max(X), 1000)[:,None]\n",
    "plt.plot(grid, knn.predict(grid));\n",
    "plt.xlabel('feature');\n",
    "plt.ylabel('target');"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.0"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "knn.score(X, y)"
   ]
  },
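  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A training score of 1.0 is expected with $k=1$ and is not evidence of a good model: each training point is its own nearest neighbour, so the model reproduces the training targets exactly (assuming no duplicate feature values). The sketch below compares training and held-out $R^2$ on separate synthetic data; the names `X_demo`, `y_demo`, `knn1` and the train/test split are assumptions made for illustration, not part of the lecture data.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.neighbors import KNeighborsRegressor\n",
    "\n",
    "# Synthetic 1D regression data: linear trend plus noise (illustrative only)\n",
    "rng = np.random.default_rng(0)\n",
    "X_demo = rng.uniform(-1, 1, size=(200, 1))\n",
    "y_demo = 5 * X_demo[:, 0] + rng.normal(size=200)\n",
    "X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, random_state=0)\n",
    "\n",
    "knn1 = KNeighborsRegressor(n_neighbors=1).fit(X_tr, y_tr)\n",
    "train_r2 = knn1.score(X_tr, y_tr)  # R^2 on the data the model memorised\n",
    "test_r2 = knn1.score(X_te, y_te)   # R^2 on unseen data\n",
    "print(train_r2, test_r2)\n",
    "```\n",
    "\n",
    "The gap between the two scores is the usual sign of overfitting; larger $k$ trades some training accuracy for smoother predictions."
   ]
  },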
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And with $k=10$:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [],
   "source": [
    "knn = KNeighborsRegressor(n_neighbors=10, weights='uniform').fit(X, y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXsAAAD9CAYAAABdoNd6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAAfoklEQVR4nO3de3hc1Xnv8e87kq2LbWIbGyjGYBswFxNuFhwnhgQQhTqU3HFxW1JykpLmhiiPexIaQOgUQps4DZyUJnETSpqkxG5IGk4uEIpjkhIrIIyBAMUQbIgDpvIFfJElj6TVP/YMjEcz0h7NrD17Zv8+z6NHntmXtWY/26/WrL3Wu8w5h4iI1LdUtSsgIiL+KdiLiCSAgr2ISAIo2IuIJICCvYhIAjRWuwLFzJgxw82ZM6fa1RARqRkzZszg3nvvvdc59wf522Ib7OfMmUNPT0+1qyEiUlPMbEah99WNIyKSAAr2IiIJoGAvIpIACvYiIgmgYC8iEgfd3bBsGSxcGPzu7q7o6RXsRUSqrbMT2tth1SpYvx5Wrw5ed3ZWrAgFexGRsHy0vru7YcUK6OuDbBbi4eHg9YoVFWvhK9iLiIThq/V9662wb1/hbf39wfYKULAXERmLz9b3xo1vnDPf8DA8++z4z51DwV5EZCw+W9/z50OqSChOpYLtFaBgLyIyFp+t744OaG4uvK25Ga68cvznzqFgLyIyFp+t70WLYPlyaG19o4xUKni9fHmwvQIU7EVExuK79d3VBfffD0uXBiN9li4NXnd1lXfeHLHNeikiEhvZ1veKFUEf/fBw0Ppubq5c63vRooq14gtRsBcRCaOrC5YsCR7GPvssHHts0OL3GKArScFeRCQsz61vnyLrszeze8zMmdmNUZUpIlJ1nnPehBVJsDezZcApUZQlIlIyXwE5gpw3YXkP9mY2FfgicLXvskRESg7cvgJyRDlvwoqiZf854Enn3J0RlCUiSVZq4PYZkCPKeROW12BvZmcBHwA+5rMcEZFxBW6fATminDdheQv2ZjYB+Cqwwjn3jK9yRESA8QVunwE5opw3Yfls2X8KaAFuCnuAmV1hZj1m1tPb2+uvZiJSf8YTuH0G5Ihy3oTlJdib2ZHAZ4DrgCYzm5p5UEvO64b845xzK51zbc65tpkzZ/qomojUq/EEbp8BOaKcN2H5atnPA5qBbwE7c34Almf+/WZPZYtIEo0ncPsOyBHkvAnLXLGvPeWcNGjFn1pg088I/gB8Hehxzu0pdo62tjbX09NT8bqJSB3r7Cyev2a0ANvdXbNpEPKZ2SPOubYR7/sI9qNUwgE3OeeuHWtfBXsRGZc6CtzjUSzYKzeOiNSXGs5f41Okwd45Z1GWJyIiAS1eIiKSAAr2IlK7YpJRshYo2ItIbYpRRslaoGAvIrUnZhkla4GCvYjUnphllKwFCvYiUntillGyFmicvYjUnKcWnMnaiUcXDvhmcMrJsPa56CtWIVecPY/Ghsq2xRXsRaTmrDjtPaw5PD36TvfUbmb1/714Lo0jUkWWR8FeRGrO7pbJnGkv8C+3fXRkHpyrroLrrqt2FcvS1Fj5HnYFexGpOXsHhjj8+GNo/uk9ic6DUwoFexGpOX37B2md2AiLzlBwD0mjcUSk5uzdP8SkJrVVS6GrJSKx0p8eYiA9POo+ewcGmTSxwk8w65yCvYjExo69+1n8t2vYlx4ac98pzRMiqFH9ULAXkdJlFwjZuDFY27VCD0Zf2dXPvvQQS9uO4PjDDiq6X0PKuOjk3yu7vCRRsBeR0mSX/tu3L5jUtGED3H332Ev/hdCfadEvOen3OPf4QypQWcnSA1oRCc9zArL+TF990wSFpkrTFRWR8DwnIOsfDFr2zRP08LXSFOxFJDzPCcj69wfBvkXBvuIU7EUkvPnzg7QEhaRSwfYyqGXvj4K9iITX0RHknymkuRmuvLKs02f77JvVZ19xGo0jIuEtWhSMulmx
YmQCsuXLRwy//O/d/XzunmcYGBx9klTWpm17AGiudMpHUbAXkRJ1dcGSJaESkP3yue1895EtzJ7ewoRi3T95zj52Bge1aMJUpSnYi0jpFi0KNYlq98AgAN/76GJmTmnyXSsZhTrGRMSbPf1BsJ/ykQ/BwoWwbJkWA68Sb8HezN5vZneZ2Qtmts/MnjGzm81siq8yRSRe9ty3hsahQZq+86+wfj2sXg3t7cEsXImUz26c5cCLwF8DW4DTgBuAc83src65cE9sRCS2XutLM1Rs3H1PD9sff5rJ8yZjhWbbLlmiXPQR8hnsL3bO9ea8fsDMdgDfAM4B1ngsW0Q8+976LVy9+rHRd1rQzlE7Xxr5fna2rYJ9ZLwF+7xAn/Vw5vcsX+WKSDQ2b+8D4IaLT8TMRu7wub+D327hpFeeG7mtArNtpTRRj8Z5e+b30xGXKyIVNjA4RFNjissXzy28Q8ursOHHQWDPV4HZtlKayEbjmNks4P8C/+Gc6ymyzxVm1mNmPb29hb4YiEhcDKSHR09r4Hm2rZQmkmBvZpOBHwCDwAeL7eecW+mca3POtc2cOTOKqonIOPWng5Z9UdnZtq2tb+TTSaWC17mzbbu7gyGZGprplfduHDNrBu4G5gFvd85t8V2miPjXnx4aO2HZWLNtPS6EIgfyGuzNbAJwF3AmcL5z7gmf5YnUNE9L/fnSnx4Ol7Cs2Gzb3IVQsjQ00xufk6pSwLeBduBdzjl9NxMpprMzmGy0alXNTD7qHwzRsh+N54VQ5EA+W/a3AZcANwF7zSz3T/QWdeeIZMSkhfub3j1s2Vkk+Bbwyq4BpjSVEUI8L4QiB/IZ7Jdkfn8m85Ori2A2rYiEaeFGEOzf9+Vf8mpfuqRjlpx02PgLnD8/6KPX0MxI+JxUNcfXuUVKEve+8Bi0cIeHHa/2pbn0jNlc0jY79HHHHjp5/IV2dAQPY3O/0WRpaGbFKcWx1LdaGO0RgxZudjnAuTMmsfCoad7LA0peCEXKoxTHUr9y+8ILJeKKy3huX5OPShi/vi+70PfEiFeI6uqC+++HpUuDei5dGryOyx/iOqKWvdSvmPSFj8lHC7fEbzT70pmFvquxHGDIhVCkPGrZS/2KQV94aJVs4Y7jG83rC31H3bKXyKhlL/UrBn3hJalUC3cc32j6My37lnLGzUusKdhL/UrqaI8i32i2txzEf845Dbe7GR793QHbnt+2FyDcjFipSQr2Ur+SOtqjyDea296ylNvPeHfwYtWGgocedlCRB8VS8xTspb6NlYirHhX5RvNq8xQO3bOD77x7Hpx66ojDJk1s4BAF+7qlYC/1L2mjPYp8o+lvbmXKlBbmnr+42jWUKlAHnUg9KjC6Z99Zb6N1Vk56A+WRTxS17EXqVd43mr6vruP1TppamFksFaWWvUhC9KeHgqGVtTKzWCpKwV4kIfr2Z4K98sgnkrpxRGJm78Ag3c9vZ7jI5F+AB5/bRvfz20s67+btezlp1ptqa2axVIyCvUjM3P6fm/jCfRvH3G/W1BYWHH5Q6PMeOb2V951+RO3NLJaKULAXiZmdfWlaJjTwb3/xlqL7NKSM4w+bgpmVXkBSZxYnnIK9SMwMDA7ROrEh6HLxIakzixNOwV4kZvrTw+Ut5B1GEmcWJ5yCvUjM9A8O0RRFQrKkzSxOOA29FImZgfRQdRYRkbqmYC8SM0E3jv5rSmXpjhKJmf70kP8+e0kc9dmLROTJl17jzodeLDqfKWvTtr2cMntqJHWS5FCwF4nIqod/y7d/9SIHT5o46n5mcObc6RHVSpLCa7A3s9nAF4HfBwz4D+Aq59yLPssViaP+9BCHHdTMumvaq10VSSBvffZm1gqsAY4H/gy4DDgW+JmZTfJVrkhcDQwO09Sox2RSHT5b9n8OzAOOc849B2BmjwPPAh8B/t5j2SKxM5AepklDKqVKfDYz3gl0ZwM9gHNuE/Ag8C6P5YrE0kBUk6VECvB55y0Afl3g/SeBEz2W
KxJL6saRavJ5500HdhZ4fwcwzWO5IrEUBHt140h1+G5mFBpRXDQnq5ldYWY9ZtbT29vrsVoiZRrHYt0Dg0Nq2UvV+LzzdhK07vNNo3CLH+fcSudcm3OubebMmR6rJlKGzk5ob4dVq2D9eli9Onjd2TnqYQPpYfXZS9X4HI3zJEG/fb4Tgac8liviT3c393z/5/y/pTfj8r+kbknBZ++BSYVHFr+wva+klaVEKslnsL8bWGFm85xzzwOY2RxgMfBpj+WK+HPrraw54mSenz6Lszc9euA2M9jaAG1tBQ89YloL7z39iMrUo7s7yEW/cWOwjKBy0csYzI2VqGO8Jw4mTj0G7AOuJei//xtgCnCyc27PaMe3tbW5np4eL3UTGbeFC/n4Eefz9CFzWfO1jxbcju/7trMzWGVq375g4fDcVaa6uvyWLbFnZo8450a0OLx1IDrn9gLnARuBbwLfBjYB540V6EVia/58+ia2MGl//8htUSzW3d0dBPq+Pl7PqDY8HLxesSLUg2JJJq9Pi5xzLzrn3uecO8g5N8U5927n3GafZYp41dHB3uZJtKYLBPsoFuu+9dagRV9If3+wXaQADQ0QKcWiRfQdcRSTh/YHLXkIfre2RrNY98aNFM2RPDwcrCcrUoBSHIvk6H9wHQ9+/S7SL22Fww+HP7wI5h93wD7bphzM3LPeAqmlpS3WXYmHqvPnw4YNQWDPF0U3ktQsBXuRrM5Ovv+TR7nmvI/AIZn3HtoDDz0yYtdDT54Ln7yzpHMf8FB1wwa4++7SH6p2dATH9fWN3BZFN5LULAV7EXj9weeOky8C4Aff+EsmDA0G21qaYeVKePPJQDDC8uiZk0s+9wEBOveh6pIl4Vv4ixYFfyBWrAj66IeHDxyNo+GXUoSCvQi8/uCzb0IzDcNDnLz12TemTKVScMdtcGcJLfkC5y4o+1C1lCDd1RX8gbj11tK6kSTRFOxF4PUHn30Tm2ndv+/AubHlPvj08VB10SIFdymJRuOIQPBgM5Wib0Izk/KHVZb74DNz7oL0UFUiomAvAkE3SHMzeye20Lo/r8ul3AefmXMXpIeqEhF140giPLRpBzf96CmGRksPcuU/80I/HLXz5eB1pR586qGqxICCvSTCAxv/myd+9xrnHndI8Z2Om8uhO3dy4e7Hgxw3lXzwqYeqUmUK9pIIO/vSTGudyNcvPyPE3hcAN1a+EnqoKlWkPntJhNf60kxtnVDtaohUjVr2UrN+u6OPlT9/nsHhsdN0r39xJ4dPbYmgViLxpGAvNeuHj7/MN7tfYMbkJqzoysZvOPvYGf4rJRJTCvZSs/YMpGlIGQ9/ph0LE+1FEkx99knR3Q3LlgWjTJYtq4tFLvYODDFpYoMCvUgICvZJ0NkJ7e2wahWsXw+rVwevOzurXbOy7O4fZEqzHrqKhKFgX+/qeBm7PQNpJjepJ1IkDP1PqXeVzrgYEeccl3xlHc+8srvoPn37hzh19tToKiVSwxTs611MlrEbGnbs2pcOvf/2vfvpeWEnZx0zg2MPLZ47/rzjR5kRKyKvU7CvB6MtdxeTZeyuvPNRfvTEyyUf97Fzj+atR2vIpEi5FOxr3VjL3cVkGbvf9O7h+MOmcOkZs0MfM6mpkf8192CPtRJJDgX7WhZ2ubsYZFzcMzDImXOnc/niuZGUJyIHUrCvZWEfvsYg4+Lu/kGmaOSMSNXof18tK+XhaxUzLjrn2DMwyORm3W4i1eLlf5+ZzQc+DpwLzAN2Aw8D1znnHvNRZiLF5OErwNd+8Ty/6d1TcNvQsGNo2GkClEgV+WpqXUAQ6L8BrAemAv8H+JWZLXbOPeKp3GSJycPXwaFhbvzR00ya2MCkIl01s6a2aEy8SBX5CvbfAW5z7o0+BjNbA2wGOoAPeCo3WWLy8HXvwBAAV19wHB86Sw9gReLIS7B3zm0r8N5rZrYRmOWjzMSKwcPXPfsHAZjc1BBZmSJSmsiemJnZdOAk4J+jKjMxqvXw
NTOZa+/vdsJbP8mkFzfBGUdGXw8RGVOUidC+BBhwS7EdzOwKM+sxs57e3t7IKibjkJNJc89vNgMw6bq/rvlMmiL1KlTL3szOB+4LsesDzrlzChx/DfDHwIecc88VO9g5txJYCdDW1jb2WnPi1fY9A3zlgd+wfzBvtM/WV+Ch7fDWywB4eUqQzmDyrlcPnMwlIrERthvnl8AJIfYbMSzEzP4C+CxwrXPu9hLqJlX2i2e38U+/2MSU5kYaUjkLhOzeA/MXH7Dv7Fe3MmfnS7HOpCmSZKGCvXOuD/ivUk9uZpcB/wh8wTl3U6nHS3VlW/T3XvW2AxfrXrgwWASlmIgyaYpIeN767M3sPQQPY7/mnFvuqxzxZ/9QEOwbG/KW/Zs/PxjiWUjEk7lEJBwvwd7M3gbcCTwO3GFmi3J+TvNRplReOhPsJzbk3SYdHcFY/kIinMwlIuH5atmfBzQBpwEPAutyfr7vqUypsGywn5Af7LOTuVpb32jhp1LB60pN5gq7QHodLqQu4oVzLpY/CxcudFJd/7DmWXfUp37o+tODhXdYt865Sy91buHC4Pe6dcVPlt339NPH3vf6651rbXXOzDlwLpUKXl9//fj2E0kQoMcViKnmimVNrLK2tjbX09NT7WokT86qV7ec8T5umXoKz3/2HaRSNvaxxeQvsJKb0qGra2T57e2F8/20tsL99wffHMLuJ5IwZvaIc64t//0oJ1VJ3OVMlGL9etJPP0Pj0CCprhvGf87cBVayDYvcBVbyu13C5OgvZT8RARTsJatAUB5MNdA4PFg4KIdValAOm6M/Jgupi9QKBXsJFAjK+xsamTA0WF5LudSgHHZYp4Z/ipREwV4CBYJyOtXIhOGh8lrKpQblsMM6NfxTpCQK9hIoEJTT2ZZ9OS3lUoNy2GGdUQz/FKkjCvYSKBCU06lGJgyly2spjycod3UFo2mWLg3Gzy9dGrzOH7kTdj8R0dDLpHnqpV184PaHGEgPjdw40A8D+4HgntjX2MSRu15hzWG/Kz+AZod0VmmBFZGkKDb0MrLFSyQenn55F9v2DLC07QgmNxVYAHzry/Doo/DqqzB1KmdeeAa894ryC67WAisiAijYJ86u/jQA1yw5gWmTJhbY40SgPdI6iYh/6rOPkwjyvOzaF6wXO6VZf+dFkkT/4+MiP6XAhg1w992FUwoUMDTsXk9cNpqdfftpndhAY35yMxGpawr2cZA7ezUrN6XAGMv8Oedo/8JaNm8vkCemgMPfVGQopIjULQX7OAiTUmCUYL+rf5DN2/s4/4RDOP2oaWMWd/KsqeOsqIjUKgX7OCgzz0vv7n4ALj7lcN516qxK105E6oCCfRzMnx/00Q8f2Of+7MGz6bj4rxiYfjB8YW3Rw/vTwXEzpzR5rKSI1DIF+zjo6AgexublZn9k1gk8deg8zj9sAk2HHDTqKc4+dganzR67C0dEkknBPg6yKQVWrAj66IeHIZVix5sOBuBLH2unZWJDlSspIrVMwT4uurqCUTc5KQVebf8QTZv3K9CLSNnqLth/8s5H+e2OcEMQY+msT8BZwT+3bO1jWmuhWa4iIqWpu2A/uamBg1oK5HypQSe2vImzjjm42tUQkTpQd8H+5veeXO0qiIjEjubMi4gkgIJ9NUSQ8ExEJFckwd7MlpmZM7MtUZQXa52d0N4Oq1bB+vWwenXwurOz2jUTkTrmPdib2VTgi8BW32XFXm7Cs2x6hNyEZ2rhi4gnUbTsPwc8BtwbQVnxFibhmYiIB16DvZktBv4U+LjPcmpGmQnPRETGy1uwN7MJwErg886553yVU1Pmz4dUkUueSgXbRUQ88Nmy/xTQBNzssYza0tEBzUUWDmluhiuvjLY+IpIYoYK9mZ2fGU0z1s/azP7HAJ8BPuGc6w9bGTO7wsx6zKynt7d3XB9oVNUe8phNeNba+kYLP5UKXi9fPuoCJSIi5TBXrA85dyezVuDIEOfrc869aGY/BhzwJznb/hF4
O7AAGHDOFXlSGWhra3M9PT0higwpf43XVCpoTYdc47WiursPSHhGR4cCvYhUhJk94pxrG/F+mGA/jsI2A0eNssutzrmrRjtHRYN9d3cwlr2vQIK01la4/34FWxGpC8WCva/cOJcC+Z3TnwYWApcA0U6uKnONVxGRWucl2DvnRnSGm9nlBN03a32UOSoNeRSRhEtGbhwNeRSRhIss2DvnLnfOHRFVeQfQkEcRSbhktOw15FFEEq7uFi8pqsAarxryKCJJkZxgD0FgV3AXkQSqr26cas+QFRGJqfoJ9loURESkqPoI9loURERkVPUR7LUoiIjIqOoj2GuGrIjIqOoj2GuGrIjIqOoj2GuGrIjIqOoj2PuaIauhnCJSJ7zks6+EceWzr+SiIHFa7EREJKRIFy+phIqvVFUKLXYiIjWqWLCvj26cStNQThGpMwr2hWgop4jUGQX7QjSUU0TqjIJ9IRrKKSJ1RsG+EC12IiJ1Jln57EuhxU5EpI4o2I9Gi52ISJ1QN46ISAIo2IuIJICCvYhIAijYV5ISp4lITCnYV4rWwBWRGPMa7M1slpndbmZbzWzAzDaZ2c0+y6wKrYErIjHnLdib2RzgIWA+cCVwAXADMOirzKpR4jQRiTmf4+y/AvwOONc5l86894DH8qpHidNEJOa8tOzN7GjgQuBLOYG+filxmojEnK9unMWZ3/vM7L5Mf/1OM/sXMzvYU5nVo8RpIhJzvoL94ZnftwMbgSXAp4CLgHvNrGC5ZnaFmfWYWU9vb6+nqnmgxGkiEnOh+uzN7HzgvhC7PuCcO4c3/oisdc59PPPvNWb2GvAdgi6en+Qf7JxbCayEYFnCMHWLDSVOE5EYC/uA9pfACSH2yy7auj3zO/8PxE8zv0+jQLCveUqcJiIxFSrYO+f6gP8q4bxPZg8tsn24hHOJiEiZfPXZdwNbgT/Iez/7+mFP5YqISAFextk75wbN7NPAHWb2FeB7wDHATcBaYI2PckVEpDBvk6qcc98ws2GCUTgfBHYA3wKuca7YDCQREfHB4hp3zawXeKGEQ2YA2zxVpx7pepVO16w0ul6lqcT12gbgnMvvQo9vsC+VmfU459qqXY9aoetVOl2z0uh6lcb39VKKYxGRBFCwFxFJgHoK9iurXYEao+tVOl2z0uh6lcbr9aqbPnsRESmunlr2IiJShIK9iEgC1GywN7Orzez/m9nLZubM7IYSj3+3mT1qZv1m9oKZXWtmDZ6qW3VmljKza8xsc+YzP2Zm7wt57B2Za5z/c4vnantnZrPN7Ltm9pqZ7TKz75nZkSGPbTazz2fuwX1mts7M3ua7ztVU5vUqdA85MzvVc7WrxsyOMLMvZe6NvsznnRPy2IreXzUb7IE/Bw4B/r3UA83sQuAughw9S4BbgWuBz1awfnHzNwRrAP8DwWfuBv7NzN4R8vhe4C15P1+sfDWjY2atBKk7jgf+DLgMOBb4mZlNCnGKrxPch9cDfwi8TLBew6leKlxlFbheAHcw8j7aWPHKxscxwFJgJ/CLEo+t7P3lnKvJHyCV+d1IkF3zhhKOfZQg937ue9cD+4HDqv3ZPFyrQ4ABoCvv/fuBx0Mcfwewpdqfw8N16QCGgGNy3psLDAJXj3HsKZn77oM57zUCzwB3V/uzxe16ZfZ1wI3V/hwRX7NUzr8/nLkGc0IcV/H7q2Zb9s65caVJNrPZwKkEeXpyfROYQNDqrTcXAhMZ+Zm/BbzZzOZGX6VYeCfQ7Zx7LvuGc24T8CDwrhDHpoFVOccOklmcx8yaKl/dqivneiXSeOMUHu6vmg32ZViQ+f3r3DczN20fcGLkNfJvAUHL/rm897PrDoT5zIeY2TYzGzSzjWb2qTp4xrGAvPsg40nGviYLgE0uWOsh/9iJBF/f60051yvro5k1qfvMbI2ZnV256tWVit9f3rJextj0zO+dBbbtzNleT6YDr7rMd8EcO3K2j2YD8AjBjdYMvAe4maC/9sOVq2bkplP4
PtgBTCvj2Oz2elPO9YLgm+QPgZeAo4C/Iliu9Pedc2srVck6UfH7KxbBfhxr3JZVXOZ3odlkVuC92BnH9TLK+LzOuVvy3vqxme0BrjKzv3POPRvmPDE13utS1jWtYeXcR5flvPyFmf2A4JvCjcBZFahbPan4/RWLYE/pa9yWY7S/jFNztsdZqddrBzDNzCyvdT8tZ3up7gSuAtqAWg32xb7JTaNwqyrXDqDQkMNyrmnclXO9RnDO7TazHwEfKrdidaji91csgr0rfY3bcmT7qRcA67JvZsa+tgJPRVSPcRvH9XoSaAKO5sB++2w/63g+82jfkGrFk7zxDCfXiYx9TZ4E3mNmrXn9qicSjOrKfz5SD8q5XsUUa8EmXcXvr8Q9oHXOvQg8BvxJ3qY/JXj6/ZPIK+XfPQQ3SKHP/OvMw+lS/THBf9JaXk/4bmCRmc3LvpH5o784s22sYycAl+Qc2wj8EfBT59xAxWtbfeVcrxHM7CDgIuBXlapgHan8/VXtcahljF9tA95PMGHBAaszr98PtObsdz/wXN6x7wCGga8C5wB/CfQDn6/25/J4vf428xmvznzmL2euwcV5+x1wvQgepP0c+BhwAXAxcHvm2C9X+3OVeU0mEbSQniAYOvhOgobA88DkvGswCFyfd/x3CLovPgy0A9/NXOPTq/3Z4na9gOXAPxE0Es4hmJT1BEEj5OxqfzbP1y0bl76ciVUfzbx+e5T3V9UvRBkX8I7MhSv0Mydnv7XA5gLHvzdzow4ALxJMqmqo9ufyeL0aCGYJv5D5zI8D7y+w3wHXi6CP9t8zx/UD+4D1wCfImTBSqz8E/aJ3AbuA3ZnPOidvnzkUmLgHtAB/D2zNXJtfAedU+zPF8XoRNBIeJFg2Lw1sJ2i9nlntzxTBNSsWp9ZGeX8pxbGISAIkrs9eRCSJFOxFRBJAwV5EJAEU7EVEEkDBXkQkARTsRUQSQMFeRCQBFOxFRBLgfwDTNOudnVB4pAAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.plot(X, y, '.r', markersize=15)\n",
    "plt.plot(grid, knn.predict(grid));"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.8671815772959105"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "knn.score(X, y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [],
   "source": [
    "knn = KNeighborsRegressor(n_neighbors=10, weights='distance').fit(X, y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXsAAAD9CAYAAABdoNd6AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAA4x0lEQVR4nO3deXhU1fnA8e/JTnayQ1jCkrDKGjGAioqiWNwVi9Z9a9VKrfyqrdaY1tpWsRbrUnEpbgXcxboLqChECDthCVsghITs+545vz/uJJmEJEySuclk8n6eJ8+Qe+fee+59hjdnzvIepbVGCCGEa3Pr6QIIIYQwnwR7IYToAyTYCyFEHyDBXggh+gAJ9kII0Qd49HQB2hIWFqZjYmJ6uhhCCNFrhIWF8eWXX36ptb6o5T6nDfYxMTGkpKT0dDGEEKJXUUqFtbZdmnGEEKIPkGAvhBB9gAR7IYToAyTYCyFEHyDBXgghnEFyMixYAFOnGq/JyQ49vQR7IYToaYmJMHs2rFwJW7bAO+8YvycmOuwSEuyFEMJeZtS+k5Nh8WKoqICGLMQWi/H74sUOq+FLsBdCCHuYVftesgQqKwHYHD2a5xOuocyrn7GvqsrY7wAS7IUQ4lTMrH2npTWe84u4GSyZuQDP+tqma+zf38XCGyTYCyHEqdjUvk/S1dp3XBy4GaF4c/QYJmTvx7u+ztjn5mbsdwAJ9kIIcSo2te8br0lixYQ5Tfu6WvteuBB8fKhy92RX5EimZu5t2ufjA/fd1/lz25BgL4QQp2KtfR8LDOf74VN5aK5NAO5q7TshARYtYkPcNGo8PJl+ZLtxTl9fWLTI2O8ATpsITQghnMbChbBqFZsGjWvcpAEFjql9JyXx1YCz8DtUxvRQDzhjvnFNBwV6kGAvhBCnZq19b9xc0rjpeFAE0bVlDql919Vb+CYfZk0cjPeTP3W1tK2SZhwhhLBHUhLJZ8xhYF05AD9efTusXg1JSV0+9Td7TpBbWs2lE6O7fK62SLAXQgg7HM2v4HC5hTuuOJ3IQG/WnnOlQ5pZtNa89kM60cH9uGBspANK2rpuC/ZKqS+UUlop9Xh3XVMIIRxl7b4cAGbFhXPhuChW782hsLzm1AeeYtbtV7tPsDG9gDvPHo67mzKj6EA3BXul1AJgYndcSwghOsyONAgfbM1kdFQAw8L8WDBtCDV1FpatT2//vKeYdZtbWk3ix6nERfpz3RlDTLixJqYHe6VUMPAM8FuzryWEEB3OX2NHGoTtGUVszyji6qmDUEoxZkAgPzttAC99f5CMgoq2y9HOrNuC79Zz++ubKKqs4ZlrJ+Hpbm447o6a/ZNAqtZ6eTdcSwjRl3U0f42daRD++U0awb6e/HxaU+379xePxtPNjTveSKGgteYc66zbQ/0HctX1T3LX5X9gZ+QIMoIi+W/cLC76JJO92aU8f90Uxg0McvSTOImpQy+VUmcCNyJNOEIIs9kG7ga2gXvu3JM7VK0BucrDi/fHnce8vesIqjZG2zSkQfjMbyhr9+Xy0NzR+Hs3hcxB/X158RdTufX1Tcx7dh2LLhzFnHFR+Ht7oLXmYGYB7559E69PnUeVpw8AX46a0Xj8aSWZvPqbOZw2yPxADyYGe6WUJ/ASsFhrvc+s6wghBNAsf03jhKcGDflrWgZ7axqE/0y9hL+fcwvfD5vCSx89YeyzWNiVXcqD7+9gwqAgbjtz2EmXPDM2jPd+OZ3fvbeD376zHTe1nRA/bypr6iifcR9ulnrm7V3Hw2tfw7O+jnUxk6ny9GJsbjrjz4lHDbrTlEfRGjNr9g8C/YC/2HuAUupO4E6AIUPM7awQQrgYa+Au8+rH7Nv/zV0/vc+tm1cZ+9rKXxMXR+Wu3bw+9RLAqHm/OfliZh/YyNdx03ny9FsJ9vHkxV9MbbNNfcKgYD677yw2phew/mA+OSVV+Hp5MLI8h3MW3sjAnIzG91625zvjH76+cN8b
Dr39UzEl2CulhgAPA7cD3kopb5vd3tZO21Ktdb3tcVrrpcBSgPj4eG1G2YQQLiouDrZtY0dULCcCQvnT+Xc2Bfu28tcsXMhzOb5kB4Tx5spHeOmMq/jjnLv545y7AUgI8+Aft09nYHC/di/t5qZIGB5KwvBQm61jYc8tRhNSVZXxB8fNzUiv4MCcN/Yyq2Y/HPAB3mpl3yLrz2Rgm0nXF0L0Ndb8NdsHxDZuygoIZUBpfpv5a34IHcmLp1/BlXu+46yjO5iZvp3vRsSTGTqQMXNmMuWBRSjVhbHvSUlGX8GSJcY3i9hYh+e8sZdZwX4bcG4r29di/AF4FThg0rWFEH2RNX/Ntt1NjQJfxs3g5j2rW61JbziYz51vpjAyMoA/z7sCfDNw27+fc2NHODYgJyT0SHBvSWndfa0lSikN/EVr/cip3hsfH69TUlK6oVRCCFdRW29hSuIXXFy4n7R6b3L9+/Pl5UPwO3N643vqLZpl69P52+d7iAn14+3bzyAi0KcHS+1YSqnNWuv4ltsl66UQwmVsSi+gtE5z7q+u5UpfTxa8nMytqbBwQB7B/bzYfKSAN5OPkHaijPPHRPLU1RPo7+fV08XuFt0a7LXW5iV+EEL0ee+lHCPA24Oz48Lw9fLg6fkT+eNHqVz3clPa4FGRAbxw/RTmjo/qWnt8LyM1eyGESyiqqOHTnVnMjx+Mr5cR2q6YPIjzx0Sy8XABVbUWRkX5MzIioIdL2jMk2Asheq/kZGOkS1oaL067lpqgcfwiYWiztwT4eDJ7jHmpg3sLyWcvhOidbPLgHDiSw3/8Yrlyz7eMenFxT5fMKUnNXgjR+1jz4NRXVrFtwCh+d/FCAqvLeXD1q/BNdet5cPo4CfZCiN5nyRIyPAP45TVPkBo1Ev/qCl55/09ElBcas1Rby4PTx0mwF0L0OmmZhfzi+iep8vTm758v4dyDKUagh7bz4PRxEuyFEL3OE5OvpN7izrtvP8iovCPNd7aVB6ePkw5aIUSvsy9kMGdn7Dg50EObeXD6Ogn2Qohe5eNtmWRVWpg0foiRKtjNGsbc3IzfeyCjZG8gzThCiF5j/4lSfv/BTk6P6c91d8yFeVOdIqNkbyDBXgjRK1TW1HP321vw9XLnueumGIuJOElGyd5Agr0Qolf4y2e72Z9Txhu3TiPShbJUdhdpsxdCOL2vUrN5K/kod549nLPjwnu6OL2SBHshhFPLLq7id+/vYHx0IIvmjOrp4vRaEuyFEB2XnAwLFsDUqcZrcrIpl9Fa8/sPdlBda2HJzyfj5SEhq7PkyQkhOsYmARlbtsA77xi/JyY6/FKrth9n7b5cFl04ihHh/g4/f18iwV4IYT9rAjIqKqBhSVOLxfh98WKH1vCLKmr40ye7mTgoiJtnxDjsvH2VBHshhP2WLIHKSgAeP/c2njz7xqZ9VVXGfgf55zf7Kaqs5a9XTsDdre+sKGUWCfZCCPulpTXW6F+ZdgUvTJ/ftM+BCciO5Jfz9k9HmB8/mLEDAx1yzr5Ogr0Qwn5xceDmRn6/pgCcFRBq/MOBCcgWf5WGh5sb958f65DzCQn2QoiOWLgQfHxIC2ta+m/joPHGPxyUgCw9r5z/7TjOzTNjiJDJUw4jwV4IYb+EBFi0iN2Dmsa7fzFqhkMTkC1bn46Hm5JOWQeTdAlCiI5JSmJL2NdEHyng/KxUlsclkLNwHhHnzOjyqcur63hv8zHmTRgoKREcTGr2QogO21rmxqQpsdyy5EHq3Nx5piDAIec9kFNGWXUdF42Pcsj5RBMJ9kKIDjmcV87x4iqmxYQQE+bH7WcNZ/nGo6zcdLT1Azow2zar2BjWGR3cz4yi92mmNeMopa4GFgDxQARwFPgAeEJrXWrWdYUQ5vp2Xw4A546KAGDRnFHsPl7Cg+/vZP3BfG6cHsPkwcG4uSn0o4nsfv1dxmbspczThxUeMWx/chUBo/dwyW2XMGNEWLNzH8mvAGBAkDThOJqZbfaLMAL8
H4BjwGTgMeBcpdQMrbXFxGsLIUzy2c4sYiP8GRLqC4CXhxuv3Xw6z67ezys/HOLjbcfx9/ZgrD9k5g0jc8FTjM8+QJ5fMNkBYQwuyqao0p/lL//EXbOG89BFo1FKNZ57zIBAQv29e/IWXZKZwf4SrXWuze/fKaUKgNeBc4A1Jl5bCGGCI/nlbEov5HcXNc8+6eXhxqILR3HnrOGs2ZPDlqOF7Pg2hUIfI5/NrqiRhFQU8+Ebv2VyVhrVnl4k3fE3XvoOTosOYt6EgWQUVLD9WDEPzR3dE7fm8kwL9i0CfYNN1tdos64rhDDPOykZKAWXT2r9v3CgjyeXT47m8snR8KebsGzZikKzfuhEYgqOE11qhAXv2hr+tHE5O258mr99vpe54wfw+a4sAH522oBuu5++pLs7aGdZX/d083WFEF1UXl3HW8lHmTM2koH2dKDGxeHmplDAzCPbGwM9AG5ueMSO5O5zRnKssJLv0nJYszeHMQMCGRzia9o99GXdFuyVUtHAn4BvtNYpbbznTqVUilIqJTe3tS8GQoie8saGIxRX1vLLWSPsO8A627ZV1tm2F4yNJMDbg093ZLP1aBEzRoQ6rsCimW4J9kopf+BjoA64pa33aa2Xaq3jtdbx4eGy9JgQziK3tJrn1x7g/DERTB7S376DrLNt8fU18uaA8Woz29bT3Y3pIYr3txyjus7CtA/+Y9pCKH2d6cFeKeUDrAKGAxdqrY+ZfU0hhGP94+t9VNXW84eLx3TswKQkWL0a5s83xtnPn2/8npRk7E9MZPqKpY1vn/j+MtMWQunrTE2XoJTyBN4HpgHna613mnk9IXq15GQjH3xampE9cuFCh+Sa6aof9uexfGMGt585jOGdWS0qIaH1+7AuhDIuZFjjpsiSPOMfixfD3LlOcf+uwrSavVLKDXgbmA1cprWW72ZCtKUbl/rriPyyau5/ZxuxEf484OjFvq0LoYzOOdy4qXGJEgcvhCLMbcZ5HrgGWAyUK6USbH4GmXhdIXqXblzqryO01vzuvR0UV9by7ILJ9PNyd+wFrAuhBNZUEFJRzIJtnzftc+BCKMJgZjPOXOvrw9YfW0kYs2mFEDZL/Z2koYbbA80Zb2w4wuq9OSReMpYxA0xYLSouDrZtA4uFLf+6vvk+By6EIgym1ey11jFaa9XGz2NmXVeIk3QgEVePsFnq70hwFNXuNnWwHqrh7skq4S+f7eHcUeHm5ZW3Y2imcBzJeilcm5O2hTdjXeovx68/s+56hSdn3dy0rwdquFW19dy3fCuBPp48dc3Exrw1DmfH0EzhOBLshety0rbwk1hruD8NNpb32zbAJrh3pYbbyW80j3+6m/05Zfxj/kTCzE5IdqqhmcJhZKUq4bqctC38JNYabtbqfQD41lYZNVwfn87XcBMTjT9olZXGH7pt22DVKuN87QTS1XtO8FbyUW4/cxhnx3XTxMa2hmYKh5KavXBdNm3hJ/xDyPUNbtrnbKM9kpI4cdMdAJT1D+9aDbeT32jyy6p58P2djI4K4P8ucvAwS9HjJNgL12VtCwc44543OP3XbzXtc8LRHif8jDQEOTFxsHx552u79nyjacWjH6dSUlnLM9dOwtvDwcMsRY+TYC9cVy8b7ZFTUm28llZhsejOn8j6jSYrIJRrrvs7GweNa9rXxjeamjoLn+3K4obpQ80ZZil6nAR74bpsR3tYaSce7XGitAqA2npNXll1509k/UbzVex0Ng0ex6MX/LJpXxvfaE6UVKE1jIp0zMLhwvlIsBeurWG0h1XZtdc75WgPi0WTXVzVWKs+kFvW+ZO1GN1zKCSack/rN5w2vtHsOFYMQHR/WejbVUmwFy6vftoZjf8ufv7fTlejB8guqaK6zsL5Y4xFvA/mdCHYJyRQ+8D/sX7oRIYVZFLj4cXyyXNJGxTHS79ZzI5Bo5s1E+WXVfPYJ6mMigwgPsbO9MWi15Ghl8LllVXVNf67uLKWQU4Yz9LzygFIGB7K
v9Yc4I8fp3LD9JhOny/5hnsoenUjT5Zu5s38Sp6edRPPeXtSVKvhuR+JjfBn8Rh3Jr7xPE94jqN4wATeiPeRjlkXJjV74fJKq2sb/11cUdvOO3vO4Xwj2MeE+REbYaQRPtiFppzlG48S6OPB2S/9jaf/cReDooIpqtVMHx7KU1dPoCInn+u/zuLTbcf4YMBEbtn4EWOuvNC5ZhYLh5JgL1xeWXXzmr0zOpxbjpeHGwMCfVh26zQAvko90alzZRRU8MWubK5PGIqPpzsRAT588uszSbp0HI/MG8M1dZm899pClNbcc9lD+NZUcfeGd5xvZrFwKAn2wuWVVjl/sN+TXcLoqADc3BTRwf2YMiSYd1IyOjUEc9n6dNyU4iabZiAfT3dumhHDuIFBsGQJA/IyuXXTxwDM27uOoGrjm4XkkXddEuyFy2vZZu9stNbsyiwxArHVTTNiOJxXzvf7czt0rtKqWlZuyuBnEwYQFdTGHAPrOPzbN33Iw2te5Q9rX2va52wzi4XDSLAXLq/UphmnyAmD/bHCSooraxkf3TSZae74AQwI8uHpr9I6VLtfuSmDsuo6bjtzWNtvso7DD6ip5I5NHzbV6sEpZxYLx5BgL1xeiU2Ad8aa/a5MY4z7eJuavZeHGw9eNJqdmcUs33TUrvPU1Vv4z4/pTIsJYcKg4Lbf2MtmFgvHkGAvXF5heQ0AMaG+5HdlZqpJtmYU4eXuxqio5rNXL5s0kJkjQ3n8f3s4nFfextFNvkw9QWZRJbed1U6tHiSPfB8lwV64vIKKGgK8PRjU35ecUucL9uv25zF1aH98PJuPcVdKsfiaiXh5uPGbFVuprbe0e543k9MZ1L8f54+JPPVFJY98nyPBXri8wvIa+vt5ERHo3ZhszFnkllazJ6uEs+LCWt0/IKgff73yNLYfK+bZ1W13nB7IKSP5UAELpg3B3c3OlaUSEozsmikpXcuyKXoFCfbC5eVbg31UoE/XM0o62I8H8gA4a2TbC4VcfNoArp46iOfXHmDzkYJW37Ni41E83BTXxA8ypZyi95NgL1xeYUUNoX5eRAb6UFuvKaio6ekiNfp+fy79fT0ZN7D9tMKJl4wlKtCHP36USn2LP1YWi+aTHcc5d3QEEQFtdLyKPk+CvXB5heW19Pf1IjLQWE81u7iqh0tk0Frzw/48zowNx+0UTS8BPp48dPEYdmeVsGp7ZrN9248VcaKkmotPizKzuKKXk2AvXF5BeQ0hfp4MDDbS92YWtbGKUzdLO1FGTmk1Z41svb2+pUsmDCA2wp9X1h1G66ba/ZepJ/BwU5w32o6OWdFnSbAXLq20qpbK2nrC/L2JCfMDsGsYY3dYZ50de2asfcFeKcXNM2NIPV7ClqOFjds3HMpnypD+BPXzNKWcwjWYGuyVUoOVUu8ppYqVUiVKqQ+UUkPMvKYQthqabAYE9yPQx5Mwf6/GdMI9bd3+PEaE+zV+47DHZZOi8fJw4387sgCoqKkjNbOY04c5Yd5m4VRMC/ZKKV9gDTAauAm4AYgF1iql/My6rhC2shqCvTVPzLAwPw45QbCvqq3np8P5nBXb9iic1vh7e3B2bFhjRsxtGUXUWTTxMSFmFFO4EDNr9ncAw4HLtdYfaa0/Bi4FhgJ3mXhdIRplFRvt81GBRrAfEe7P/hOlzdq8e8KWo4VU1Vo4y84mHFtnjgwjs6iSzKJKNqcXohRMGSI1e9E+M4P9pUCy1vpAwwat9WHgR+AyE68rRKOs4iqUgkhrsB8XHURhRW2Pd9ImH8zHTcG0YR2vkZ9uPWbT4QL2ZJcQE+on7fXilMwM9uOAXa1sTwXGmnhdIRplF1cR5u+Nl4fxUT8t2kg21pB8rKckHyrgtOggAnw6HqRHRwXi6+XOtowi9p8oY6R1ZSsh2mNmsA8BClvZXgDId07RLbKKqxrb6wFGRwXg5eHGpvTWPprdo7Kmnq0ZhSSMCO3U8e5uitgIf/ZklXA4r7xxGUMh2mP20MvW
GkbbnD2ilLpTKZWilErJze3Yog1CtCaruLKxvR6MFZvOGBbC92ld/HwlJ8OCBUYSsQULOrSU3+YjhdTWaxKGdy7YA4yMCOCnwwXUWbTU7IVdzAz2hRi1+5b603qNH631Uq11vNY6Pjy8Y6MUhGhJa83xoqqThjbOigtnf05Z59vtExNh9mxYuRK2bIF33jF+T0zEYtFkFlWyKb2ATekFraZU3nzE6FSNH9r5L7hxkU0BflB/306fR/QdHiaeOxWj3b6lscBuE68rBAC5ZdWUVdcRE9o8GM6KC+fxT/fwdWo2N888Re73lpKTjUW5KyoaN5V4+PDFiJl8u6ueHxM/p7i2+RfaOWMjefzy8URYv2HsySphWKhfp9rrGwwNbRq9PKCt5QeFsGFmsF8FLFZKDddaHwJQSsUAM4GHTLyuEACk5xkBuWHmbIPYyAAmDApi+cYMbpoRg1J2pgQGWLKEH8NjWTrtCqrdvehXV836IROo9vRmQEkucwr3M+mGyxnc3xeL1qSkF/LqD4eZ/9IGPr7nTIJ8PdmdVcJpg4JOfa12RB9rHORG5D13wMJfS4pi0S4zm3FeBtKBj5VSlymlLgU+BjKAl0y8rhAAjTNlh4ed3KZ93bQh7DtRyoZD+R07aVoar0+ZR0r0WOrd3MkIiuSand/w8ev3s/7FW3gq+U2uP2MoZ8eFc86oCBZdOIo3b5tGRmElz3yTRklVLUcLKhg7oP0sl+1KTCT6+qsbf/V6Z0VjM5IQbTGtZq+1LldKnQc8A7yJ0TG7GviN1rrMrOsK0eBwfjme7oqBwSc3c1w+OZolq/fz18/28vE9M0+ZdRKMhGr/Of0a1vnGcsGBZJ79ZHHzN7SxWHd8TAhXTxnEf386ynTrCJxOB3trM1KITTMSFovRrLR4McydKzV80Sozm3HQWh8FrjLzGkK05XBuOYNDfPFwP/kLrI+nOw9eNJrfrNzGS98f4lfnjGjzPDmlVfz720Ms33iUquDTmHPwJ/7vuzdOfmM7i3XfPDOGlSkZPPN1GgBxLdabtduSJVBpdCwvWfUUnpa6pn1VVcZ+CfaiFaYGeyF60qG8MoaHtZ2G6bJJA/l69wkWf7WPQf37ccnEgc32a615fX06T3+VRkVtPZdPiuZX5wxn5PMb4bMyoyZvsRivPj7tLtY9ZkAg4wYGknq8BE931Ww4aIekpYE11cNle75rvs9igf1tL10o+jYJ9sIlVdXWczC3nAvHtb2gh1KKJ6+eQG5pNQtXbCW/rJqb3LJRzz5Lfdp+Hph+Mx/5D+PsuHCSLh3HsIY/HElJRnPJkiVGcI2NhYULT1mjPs+/llQgoqwQ9+uvs+uYk8TFwbZtRmBvqY1mJCFAgr1wUXuzS6m3aMYNbH/Ui5+3B8tuPZ37lm/lsU92s+7wZuZvPcLOqDF85D+MBzas4F6v4aiwac0PTEjoWKBOTCTmfxvhgnuhptoYm79qlfFtICnJ/vMsXGgcZ9tm36CdZiQhZPES4ZJ2WnPf2DPE0dfLg6VxdTywYQWrh03lrisf5rkZP+eqnd9w7/dvoZ5e3KEZsiexdqqOP7oHgMCq8uadqh05d0KC8QfC19eoyYPx6uvbbjOSEBLshUtKzSymv68nA+2ccOT27LP8et3bvL3i4cZtD363zMjt0dDx2VnWTtVReUd47Ot/8/SnzzTt68y5k5Jg9WqYP99I1zB/vvF7R74hiD5HmnGES9p1vJjx0UH2T5iydnzOPLKdNUvvxE1rIsqLjH1d7fi06VS9ecv/mu/r7Lk72owk+jyp2QuXU1NnYV926Snb65uJi2tsFhleeJyYoqymfV3t+LQ590mkU1V0Ewn2wuWknSiltl4zProDE5cWLjQ6OFvT1Y5PM88thJ0k2AuXk3rc6Jwd35GavZkdn9KpKpyAtNkLl7Mrs4QAbw+GhHQw9W8nx8/3+LmFsIMEe+FydmYWM3ZgoF35bk5iZsendKqKHiTNOMKl1NVb2JNVwvjo
rqUQFsLVSLAXLuVgbjnVdZaOdc4K0QdIsBcuZVfDzFmp2QvRjAR74VJ2HS+mn6c7w1pZsESIvkyCfV+RnAwLFhjT6xcs6FquFyeWmlnC2IGBuHemc1YIFybBvi9ITDSWrVu5ErZsMTIuuuAydhaLJvV4MeMHSnu9EC1JsHd11oyLVFQ05mfpdMZFJ3c4v5zymnrGSXu9ECeRYO/qrBkXKzy9ueOKhznc32Y1pq5mc3QyDZ2zHZo5K0QfIcHe1VkzLn4fM4Wv46bz5/Nub9rnYsvY7c4ylvyLjZTOWSFakmDvCtrrfLVmXKz28AKgWbeli2Vc3JtVyohwfzxbWWBciL5O/lf0dqfqfLVmXDwWFAGA0jZrl7pYxsW0E6WMjgro6WII4ZQk2Pdm9nS+WjMuHg0bBECxT4BLZlwsrqglq7iKUVEyEkeI1kiw782sna8amHb367wx+WdN+2w7X5OSODLrQgCOhw10yWXs9p0oBZCavRBtkGDfm1k7Xwv6BZITEMqjc37VtK9F5+vRWiPBabZfCNVvvuUyNfoG+7JLABglwV6IVpkS7JVScUqpJUqpHUqpMqVUllJqlVJqohnX67Osna8ZwVEn77PpfK2qrSerpIoR4X7UWzSH88q7uaDm25tdSoCPBwPsXGBciL7GrJr9HOBc4HXgEuBuIBz4SSk11aRr9j0tOl+bsel83ZdditZwycSBjb/3VgdySnl+7QE+25mFbuinAA7mlhEb4W//AuNC9DFmLV6yAnhe2/xvVEqtAdKBhcCNJl23b7F2vmasaWquqfb0wtvTo1nn645jRQBcPima59Yc6LXB/pPtx7l/5TbqLMbH6s6zh/OHi8cAkFFQybRhIT1ZPCGcmik1e611nm2gt24rBtKAaDOu2WclJXH0+lsbfz3881tP6nzdllFMqJ8XQ0N9iYsMIPlQfk+UtEv2Zpew6N3tTBnSn40Pz2bBtMG8vO4Qe7NLqKmzkFVcyeD+/Xq6mEI4rW7roFVKhQDjgT3ddc2+4qDyI8zfmDS17d7fN+t81Vrzw4Fcpg0LQSnF7DERbDlaxMbDBV2/sIMyaWYWVbLhYH6bfQlaax79KJUAHw+ev34KEQE+PHjRaHw83Hnx24McL6rEomFwR9ecFaIP6c7ROP/CmMD5z7beoJS6UymVopRKyc3N7baC9XYHcso4f0wkwb6ebD1a1Gzf7qwSTpRUc+5oo13/V+eMwNfLnQ+3Znbtog7IpLn7eAnz/72BmX9bw4KXkzl38bdc9eJ60lsE/Q0H89mYXsCvz4slPMAbgGBfLy6fPJCPtx1ny9FCQIK9EO2xK9grpc5XSmk7fr5t4/jfA9cB92qtD7R1Ha31Uq11vNY6Pjw8vFM31Nfkl1VTUF5DbGQA02JC+C4tF4ulqQXtvc3H8HRXzLYGe18vD+aMjeSznVlU1dZ37qJtTOaqqqmjbMlz6A0bTnmKtXtzuPyFHzmUV85Dc0fz9u1n8Md5YzmYW8a1SzeQU1rV+N63fjpCqJ8X154+uNk5Gjqcv9lzAoAhEuyFaJO9HbTrgTF2vK+i5Qal1C+BJ4BHtNavdaBswg57sozO1rhIf8IDvPlq9wk2HMpn5sgw8suqeS/lGBeNH0Cov3fjMdeePoSPth3n3ZQMbpge0/GLWidzAdS6uXM4JJo7rnyEI9aMmgEfZjPzwGZunDGUGSPCTjp8X3Ypd721mVGRASy75fTGss0cGcb04aFc/sKPJH2ym+evm0J5dR1r9uZwzdTB+Hi6NztPXKQxpv6bPTl4uisiA2XYpRBtsSvYa60rgL0dPblS6gbgBeBprfVfOnq8OLXt1pE2E6KD8fJwIyLAm79+vodlt0zjofd3UlFbz8LZsc2OSRgeQvzQ/rzw7UGumjoIXy/7/ubX1VvYdbyElKowts9bxP6wIRwMHUStuycA1239nJii4xyMm8TqQH++SM3m56cP5s+Xj29MTmaxaBa9u50Abw/+YxPoG4wdGMid
Zw3nubUHuP/8UvZklVJVa2HehAEnlSfM35tQPy/yy2sYEuIrq1MJ0Q6zhl6ilLoC+A/witZ6kVnX6eu2Hi1ieJgfQb5GwP3TZeO5++3NxD/+DQBJl45jZETzlL9KKX530Wjmv7SBf3yVxiPzxrZ7jfyyapZ+f4j3txwjr6wGRl1EdHEOo3LTOedQCsMLMql3c2f+jq9xV8AIH6oefJglq/fz4rcHqaqt55lrJ6GU4ps9J9iZWcw/5k8krEWgb3DzzBhe/O4gH27NpKC8hgAfD+JjWh9WOSTUl/zymsa2fCFE60wJ9kqps4HlwA5gmVLKdm5+tdZ6qxnX7WvqLZpN6QVcMDaycdtF46N4567pfL8/j+nDQ5k+IrTVY6cNC+G6M4bwyg+HmTYshDnjWpmFC3yVms2id7dTXlPP+WMimDdhINMK0omcN99os2+pny/cdx8+nu6NI2ae+SaNWaPCuWLyIF5Zd5jo4H5cOnHgycdahfl7M2NEKJ/vzMbNTRE/tH+btfbIAB/rMV5tnk8IYd5onPMAb2Ay8COwwebnQ5Ou2efszCymuLKWs2Kbt4vHx4Tw2wvi2gz0DR6dN5YJg4K4f+W2xhEttrZlFPHLtzYTE+bHFwvP4qUb4rlk4kAiz51hTNry9TXSMkCbmTTvPW8kEwYFsfjLNI7kl7MxvYDrE4bgcYqc87O8KziUV86BnDKmJn/d5rDOyIoiAMK/+MSlF1IXoqvMmlT1mNZatfETY8Y1+6J1acbw1DNHntwJag8fT3eW3hBPWIA3N766kZT05mPvt2cUYdGw9IZ4YiNbJBhLSjImb82fb4yzbyOTprubYuHsWDKLKvnN4+8CcOGrT7UflBMTmfLo/Y2/jlmzqvVhnYmJhCx/HQD/zCMuu5C6EI4gWS97sdV7cxgfHXhSJ2dHRAX5sPLO6UQEeHPjaxtZ/8m6xolSuW+uwF3Rdnt4QgIsXw4pKcZrG5k0z1n+AnH5R9nqHU508QlG/PeVtoOydVjnuKOpjZtG5h09eYF06/s8qo0hmhbl5rILqQvhCBLse6mMggq2ZRTxs9Pabvu2V1SQDyvuSmBwbSk3f5fHmk0Hqd+6jR/LPQktLcA96bHOnzw5GfenF3PBPmPs/ejcI+0HZeuwTu/6usZNg4pzjH/Y5ui3vi+w2piA5VvbNC7f1RZSF8IRTBuNI8y1avtxAC6ZePKQxM6ISN3GiqX3cuO833PXFX/Av7qCQt8g7tj4Afy0AubO7VwOfGtQHp2bDkBQpU0StoagbHtea45+gEt3f8ueiGG4NyylaJuj3/q++Tu+Is83mDs3ftB0DhdbSF0IR5CafS9Ub9Gs2HSUaTEhDOrvoFmjS5bQvzCX199NJMAa6BO/eYmH177WtZqyNShfsD+ZG7b8j999/0bTvtaCsjVHP8Cznyzm61fvadpnu0C69X3e9XXc/+N/8a2tbv19QghAgn2vtHZvDhkFldw0I8ZxJ7UG5ZDKEj5ddh8vfvgEt2z+xNjXlZqyNSj71Nfy56//TVSZTcbN1oKyNUd/q2wXSLf3fUIIQIJ9r7RsfTpRgT7MGRd56jfby6ZGPaA0n7lp65v2daWm3NGgbM3Rf8phnfa+TwgBSLDvdTYfKeCHA3ncPDOmMQWBQ5hVU+5MULZzWKfd7xNCoFqsMeI04uPjdUpKSk8Xw+lc93IyaSdK+f5359qd08ZuiYnGCJmqKqPpxs3NCPSLFnU9gCYnG+3++/dDbKzxx0Vq30I4nFJqs9Y6vuV2GY3Ti3y9+wTrD+bz6Lyxjg/0YAT0uXPNCcoJCRLchehBEux7iYqaOh5blUpcpD83TB9q3oUkKAvhkqTN3pm0s8zfM1+nkVlUyeOXn+bYtnohRJ8gNXtn0dBeXllpTCratg1WrYJFi1h73T28vO4wv0gYwrRhraf6FUKI9kiwdwa2y/w1sKYUSH/5TR7QZzA6KoBHftZ+3nkhhGiLtAc4A5tl/rL9Q3lu+nzqlBt5
vkFcf9kj6MpKXrh+yknL8gkhhL2kZu8MbPLB/OHCe1gzchrh5YUU+gSSGRTJRxuXMjz8qh4upBCiN5Ng7wzi4mDbNoo9+/Hj0IkAPDh3IQCjctOZFOXXk6UTQrgACfbOYOFCWLWKD8acR7WnN2+teJj1Qyfiri38fN938PE7PV1CIUQvJ8HeGSQkYHlgEW9nD2RiVhpnHtnOmRk7m2avyrh3IUQXSQetk1j987s5EDKImzxzJc+LEMLhpGbvBLTWPLf2AIND+nHpXx4H9yd6ukhCCBcjNXsnsP5gPtszivjlrBF4yOxYIYQJJLI4gefXHiAiwJurpgzq6aIIIVyUBPselnwon/UH87nz7OEyaUoIYRoJ9j3BmvBMT53Kk89+QpSPG79IMDGTpRCiz+uWYK+UWqCU0kqpY91xPaeWmAizZ8PKlawu9mCLTzgLP3sRn8f/1NMlE0K4MNODvVIqGHgGyDb7Wk7PmvDMUlFJvk8Ai8++gWEFmVyT8qmRCM0mpbEQQjhSdwy9fBLYDmQB53fD9ZzXkiUc8unP7dc9zaHQwQD86+O/46EtxlKAS5bIBCohhClMDfZKqZnAL4AJwCNmXqs3KD14hFuvSqTU25ebNn9Cv9oqfrb3B2OnxWIsBSiEECYwLdgrpTyBpcBTWusDSimzLtVrPHH6fDL8Ilm+/A9MO5bafKebm5EQTQghTGBmm/2DgDfwVxOv0Wvsyy5lZUAsN+744uRAD0YenPvu6/6CCSH6BLuCvVLqfOtomlP9fGt9/0jgYeBerXWVvYVRSt2plEpRSqXk5uZ26oba1c4ar2Z74dsD+Hp7sHBaFPj6GjV5MF59fSXhmRDCVPY246wHxtjxvoZ19Z4F1gDJ1tE4AF6Asv5erbWubHmw1nopRtMP8fHx2s6y2aedNV7NTjaWXVzFpzuyuGlGDMHzLoSL5xidsfv3Q2yskeJYAr0QwkR2BXutdQWwtwPnHQsMBQpb2VcILAF+04HzdU07a7yyeDHMnWtqsH0zOR2L1tw8I8bYkJAgwV0I0a3M6qD9OeDTYttDwFTgGqB7J1fZrPF6LDCcPRHDmHQ8jfCKItOHPFosmg+3ZDIrLpzBIb6mXEMIIU7FlGCvtT6pMVwpdTNG8823ZlyzPaUHj7Ay/jLeH3ceeyKHG+XRFqZk7mVu2nouO3KccJOuvTWjkOPFVfzfRaNMuoIQQpyay+ez/2JXNo/M+i157v2YeHwfj6x5hfHZB/hp8Hi+jJvO4+fdzmJLHbd9uZf7Zsfi7eHYZGT/25GFt4cbF4yNcuh5hRCiI7ot2Gutb+6uazV4M/kIf/xoFxNC/Xll6e+YdHhn476EjF0sXL+CA9Ejee6hF3h+7UG+S8vlPzdPIzzA22Fl+D4tl4Thofh7u/zfVSGEE3PZrJfrD+SR+PEuZo+O4J0HLmDSDVe0OuRx5G3X8c97L2DpDVM5mFPOgpeTKamqdUgZMosqOZhbzlmxYQ45nxBCdJZLBvvqunoe/mgXQ0P9+Nd1k4088UlJxpqu8+e3usbrnHFRvHpzPOl55fzu3R1tnnvd/lyufOFHpvz5a256bSP7skvbfm+aMVdgVpxZPQJCCGEflwz2720+xuG8chIvGYuvl03zSUICLF8OKSnGa4sRODNGhPHbOXF8kZrNmr0nTjrvF7uyuPG1jRSU1zBnbCS7Mou58oUfST1e3Go5Nh8pJNTPi5ER/g69PyGE6CjXCvbJyegFC3jzjW8YW1PArIKDHT7F7WcOZ3i4H09+sQ+tm+Z1ZRVXcv/K7UwaHMxnC8/ib1dN4NP7ziLAx5OFK7ZRW2856VzbjxUxYVAQkhdICNHTXCfYWxcF2fHdFvYGRHHD2v+izj/f2N4BXh5u/PLsEezNLiX5UEHj9qe/SqPeonn255Mbvy1EBfnw58vHcyCnjPc3N586UFZdx/6cMiYODu7yrQkhRFe5RrC3mSG7esTpuFnqmbv3h6YZsh3MgXPppIEE+3ry
341HAcgpreLDrZlcnzDkpIlR54+JYMyAQJatT2+2feexYrRGgr0Qwim4RrC3mSG7ZsTpTM3cS3BVmbGvYYZsB/h4unPRuCjW7s2huq6eVduOU2/RXH/GyevEKqVYMG0we7NLOZDT1Fnb0I5/WnRQJ29KCCEcxzWCfVoaaI0Gbtr8Cbdv+rBpXycXBblgbCRl1XVsOlzI2n05jIoMaLOj9cJxxoSpL1ObOnUP5ZUT1M+TUD+vDl9bCCEczTWCfVwcuLmhgGt2rebC/TbNNp1cFGTasBDclDHUctPhQs6Oa3usfGSgD6OjAkg+lN+47XBuOcPD/aRzVgjhFFwj2C9caCz+0ZpOLgoS4OPJqKhAXt+QTk29hWnDQtt9/+kxIWw5UkiddVTOobwyhoX5dfi6QghhBtcI9gkJRl56By8KMsm7mqpaI3iPfeIP7Xb0ThkaTHlNPQdyyyivruNESTUjwmV8vRDCObhGsIdTzpDtsMREYt/4d+OvA5cvg9mz2xzKOSoyEIC0E2UczisHYLjU7IUQTsK1snM5alEQ61DO2Iimtn51isVOhof74aZg/4lS3K3t9ENDJdgLIZyD69TsHck6lHNkfgYA0cU2qRPaGMrp4+lOTJgf+7JLySo2hoEODG6jH0EIIbqZa9XsHcU6lDOqNJ8Hvn+Ti9LWN+1rZyjnsFA/MgorGRJShbeHG0H9PLupwEII0T6p2bfGZijnrzesJNZawwfaHco5INiH40WVZJdUMSDIR4ZdCiGchgT71nRyKOfA4H4UV9ZyOK+cyEBpwhFCOA8J9q3p5FDOgUH9AEg9XkJUkAR7IYTzkDb7tiQlGaNuliwx2uhjY40afzujfQYG92v8d7i/45Y2FEKIrpJg354ODuWMsmm66S85cYQQTkSacRwoLKApwPf3lWAvhHAeEuwdyHYJxP6+MuxSCOE8JNibJFhq9kIIJyLB3pFsEqX1T3qkwytkCSGEWSTYO4p1DdwG/T96r93EaUII0Z1MDfZKqWil1GtKqWylVLVS6rBS6q9mXrNH2KyB2yC4oqjTa+AKIYSjmRbslVIxwEYgDrgPmAM8BtSZdc0eY7MG7m2bPgLAu956m51YA1cIIRxNaa3NObFSXwAhwEytdW1Hj4+Pj9cpKSmOL5gZpk6FLVva399b7kUI0asppTZrreNbbjelZq+UGgFcCPyrM4G+17EmTmtVJ9fAFUIIRzKrGWem9bVSKfW1tb2+UCn1hlKq/cVceyMT1sAVQghHMivYD7S+vgakAXOBB4GfAV8qpVq9rlLqTqVUilIqJTc316SimcCkNXCFEMJR7MqNo5Q6H/jajrd+p7U+h6Y/It9qre+x/nuNUqoYWIHRxPN5y4O11kuBpWC02dtTNqfRicRpQgjRXexNhLYeGGPH+xrGHuZbX1v+gfjK+jqZVoJ9r+eoNXCFEMLB7Ar2WusKYG8HzpvacGgb+y0dOJcQQoguMqvNPhnIBi5qsb3h900mXVcIIUQrTMlnr7WuU0o9BCxTSv0b+AAYCfwF+BZYY8Z1hRBCtM60xUu01q8rpSwYo3BuAQqAt4Dfa7NmcgkhhGiVaTNou0oplQsc6cAhYUCeScVxRfK8Ok6eWcfI8+oYRzyvPACtdcsmdOcN9h2llEppbYqwaJ08r46TZ9Yx8rw6xuznJSmOhRCiD5BgL4QQfYArBfulPV2AXkaeV8fJM+sYeV4dY+rzcpk2eyGEEG1zpZq9EEKINkiwF0KIPqDXBnul1G+VUp8opbKUUlop9VgHj79cKbVVKVWllDqilHpEKeVuUnF7nFLKTSn1e6VUuvWetyulrrLz2GXWZ9zy558mF9t0SqnBSqn3lFLFSqkSpdQHSqkhdh7ro5R6yvoZrFRKbVBKnW12mXtSF59Xa58hrZSaZHKxe4xSapBS6l/Wz0aF9X5j7DzWoZ+vXhvsgTuACOCjjh6olLoQeB8jR89cYAnwCPCEA8vnbP6MsQbwcxj3nAy8q5S6
2M7jc4HpLX6ecXwxu49Syhcjdcdo4CbgBiAWWKuU8rPjFK9ifA4fBeYBWRjrNUwypcA9zAHPC2AZJ3+O0hxeWOcxEpgPFALrOnisYz9fWute+QO4WV89MLJrPtaBY7di5N633fYoUANE9fS9mfCsIoBqIKnF9tXADjuOXwYc6+n7MOG5LATqgZE224YBdcBvT3HsROvn7habbR7APmBVT9+bsz0v63s18HhP30c3PzM3m3/fbn0GMXYc5/DPV6+t2WutO5UmWSk1GJiEkafH1puAJ0at19VcCHhx8j2/BZymlBrW/UVyCpcCyVrrAw0btNaHgR+By+w4thZYaXNsHdbFeZRS3o4vbo/ryvPqkzobpzDh89Vrg30XjLO+7rLdaP3QVgBju71E5huHUbM/0GJ7w7oD9txzhFIqTylVp5RKU0o96AJ9HONo8TmwSuXUz2QccFgbaz20PNYL4+u7q+nK82rwK+ua1BVKqTVKqbMcVzyX4vDPl2lZL51YiPW1sJV9hTb7XUkIUKSt3wVtFNjsb882YDPGB80HuAL4K0Z77e2OK2a3C6H1z0EB0L8LxzbsdzVdeV5gfJP8H3AcGAr8H8ZypRdorb91VCFdhMM/X04R7Duxxm2XLmd9bW02mWplm9PpxPNSdOF+tdb/bLHpM6VUGfAbpdTftdb77TmPk+rsc+nSM+3FuvI5usHm13VKqY8xvik8DpzpgLK5Eod/vpwi2NPxNW67or2/jME2+51ZR59XAdBfKaVa1O772+zvqOXAb4B4oLcG+7a+yfWn9VqVrQKgtSGHXXmmzq4rz+skWutSpdSnwG1dLZgLcvjnyymCve74Grdd0dBOPQ7Y0LDROvbVF9jdTeXotE48r1TAGxhB83b7hnbWztxze9+QeotUmvpwbI3l1M8kFbhCKeXbol11LMaorpb9I66gK8+rLW3VYPs6h3+++lwHrdb6KLAduL7Frl9g9H5/3u2FMt8XGB+Q1u55l7VzuqOuw/hP2pvXE14FJCilhjdssP7Rn2ndd6pjPYFrbI71AK4FvtJaVzu8tD2vK8/rJEqpQOBnwE+OKqALcfznq6fHoXZh/Go8cDXGhAUNvGP9/WrA1+Z9q4EDLY69GLAALwHnAPcDVcBTPX1fJj6vv1nv8bfWe37R+gwuafG+Zs8LoyPte+BuYA5wCfCa9dgXe/q+uvhM/DBqSDsxhg5eilEROAT4t3gGdcCjLY5fgdF8cTswG3jP+oyn9PS9OdvzAhYBL2NUEs7BmJS1E6MSclZP35vJz60hLr1ojVW/sv4+qzs/Xz3+ILrwAJdZH1xrPzE27/sWSG/l+CutH9Rq4CjGpCr3nr4vE5+XO8Ys4SPWe94BXN3K+5o9L4w22o+sx1UBlcAW4F5sJoz01h+MdtH3gRKg1HqvMS3eE0MrE/eAfsA/gGzrs/kJOKen78kZnxdGJeFHjGXzaoF8jNrrtJ6+p254Zm3FqW+78/MlKY6FEKIP6HNt9kII0RdJsBdCiD5Agr0QQvQBEuyFEKIPkGAvhBB9gAR7IYToAyTYCyFEHyDBXggh+oD/B5iQPoMI6NvBAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.plot(X, y, '.r', markersize=15)\n",
    "plt.plot(grid, knn.predict(grid));"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- So, we have KNN as a new supervised learning technique in our toolbox.\n",
    "- It can be used for classification or regression (much like the other methods we've seen).\n",
    "- It works by finding the $k$ closest neighbours to a given \"query point\".\n",
    "- This fundamentally relies on a choice of distance.\n",
    "- sklearn's KNN methods use Euclidean distance by default, but you can set others."
   ]
  },
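  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To make the role of the distance concrete, here is a minimal from-scratch sketch of KNN regression. It is not how sklearn implements KNN; the helper names (`euclidean`, `manhattan`, `knn_predict`) and the toy data are made up for illustration.\n",
    "\n",
    "```python\n",
    "import math\n",
    "\n",
    "def euclidean(a, b):\n",
    "    # Straight-line distance between two points\n",
    "    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))\n",
    "\n",
    "def manhattan(a, b):\n",
    "    # Sum of absolute coordinate differences\n",
    "    return sum(abs(ai - bi) for ai, bi in zip(a, b))\n",
    "\n",
    "def knn_predict(X_train, y_train, query, k, distance):\n",
    "    # Sort training indices by distance to the query and keep the k closest\n",
    "    nearest = sorted(range(len(X_train)), key=lambda i: distance(X_train[i], query))[:k]\n",
    "    # Uniform weighting: plain average of the neighbours' targets\n",
    "    return sum(y_train[i] for i in nearest) / k\n",
    "\n",
    "# Hypothetical toy data, chosen so the two distances disagree\n",
    "X_toy = [[3.0, 0.0], [2.0, 2.0], [10.0, 10.0]]\n",
    "y_toy = [10.0, 20.0, 30.0]\n",
    "query = [0.0, 0.0]\n",
    "\n",
    "pred_euclid = knn_predict(X_toy, y_toy, query, k=1, distance=euclidean)\n",
    "pred_manhattan = knn_predict(X_toy, y_toy, query, k=1, distance=manhattan)\n",
    "print(pred_euclid, pred_manhattan)  # 20.0 10.0\n",
    "```\n",
    "\n",
    "With $k=1$ the two distances pick different nearest neighbours for the same query point, so the predictions differ."
   ]
  },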
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Pros/cons of KNN for supervised learning\n",
    "\n",
    "Pros:\n",
    "\n",
    "- Easy to understand, interpret.\n",
    "- Simple hyperparameter controlling the fundamental tradeoff.\n",
    "- Can learn very complex functions given enough data.\n",
    "\n",
    "Cons:\n",
    "\n",
    "- Can be potentially be VERY slow.\n",
    "- Often not that great test accuracy."
   ]
  },
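  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The way `n_neighbors` controls the fundamental tradeoff can be sketched on synthetic data. Everything below (the data-generating rule, the 10% label noise, the particular values of $k$) is an assumption made for illustration, not part of the lecture's dataset.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "\n",
    "# Synthetic data: the label depends on the sign of x0 + x1, with 10% of labels flipped\n",
    "rng = np.random.default_rng(0)\n",
    "X_syn = rng.normal(size=(400, 2))\n",
    "y_syn = (X_syn[:, 0] + X_syn[:, 1] > 0).astype(int)\n",
    "flip = rng.random(400) < 0.1\n",
    "y_syn[flip] = 1 - y_syn[flip]\n",
    "\n",
    "X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, random_state=123)\n",
    "\n",
    "# Record (train accuracy, test accuracy) for several values of k\n",
    "scores = {}\n",
    "for k in [1, 5, 25, 100]:\n",
    "    model = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)\n",
    "    scores[k] = (model.score(X_tr, y_tr), model.score(X_te, y_te))\n",
    "    print(f'k={k:3d}  train={scores[k][0]:.3f}  test={scores[k][1]:.3f}')\n",
    "```\n",
    "\n",
    "Small $k$ gives a more complex model that fits the training set closely; larger $k$ averages over more neighbours and gives a smoother model. Comparing the train and test columns shows where the gap between them opens up."
   ]
  },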
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: for regular KNN for supervised learning (not with sparse matrices), you should scale your features."
   ]
  },
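  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A hedged sketch of why scaling matters for KNN, using made-up data: one small-scale feature determines the label, and one large-scale feature is pure noise. The variable names and the scale of 1000 are assumptions chosen to exaggerate the effect.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.model_selection import train_test_split\n",
    "from sklearn.neighbors import KNeighborsClassifier\n",
    "from sklearn.pipeline import make_pipeline\n",
    "from sklearn.preprocessing import StandardScaler\n",
    "\n",
    "rng = np.random.default_rng(0)\n",
    "informative = rng.normal(size=400)          # small-scale feature that determines the label\n",
    "noise = rng.normal(scale=1000.0, size=400)  # large-scale feature unrelated to the label\n",
    "X_syn = np.column_stack([informative, noise])\n",
    "y_syn = (informative > 0).astype(int)\n",
    "\n",
    "X_tr, X_te, y_tr, y_te = train_test_split(X_syn, y_syn, random_state=123)\n",
    "\n",
    "# Same classifier, with and without standardizing the features first\n",
    "unscaled = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr)\n",
    "scaled = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)).fit(X_tr, y_tr)\n",
    "\n",
    "acc_unscaled = unscaled.score(X_te, y_te)\n",
    "acc_scaled = scaled.score(X_te, y_te)\n",
    "print(f'unscaled: {acc_unscaled:.3f}  scaled: {acc_scaled:.3f}')\n",
    "```\n",
    "\n",
    "Without scaling, the Euclidean distance is driven almost entirely by the noise feature, so the neighbours carry little information about the label. Putting the scaler in a pipeline also ensures it is fit on the training split only."
   ]
  },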
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Neighbours True/False (10 min)\n",
    "\n",
    "1. Our method from last class would never recommend an item if it had no common reviewers with the query item.\n",
    "2. If we transposed our item-user matrix into a user-item matrix, we could use all the same methods to find similar users to a query user.\n",
    "3. KNN classification with $k=1$ is guaranteed to get 100% training accuracy.\n",
    "4. Unlike linear regression, KNN regression needs to consult the training data when calling `predict`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Intro to NLP (5 min)\n",
    "\n",
    "- Natural Language Processing (NLP) involves extracting information from human language.\n",
    "- There are many possible NLP tasks for many purposes. Here are just a few examples:\n",
    "  - translation\n",
    "  - summarization\n",
    "  - sentiment analysis\n",
    "  - relationship extraction\n",
    "  - question answering / chatbots\n",
    "- NLP is very difficult! Some examples:"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Example: Lexical ambiguity\n",
    "\n",
    "\n",
    "<img src=\"img/lexical_ambiguity.png\" width=\"800\" height=\"800\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Example: Part-of-speech ambiguity\n",
    "\n",
    "<center><img src=\"img/pos_ambiguity.png\" width=\"800\" height=\"800\"></center>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "Example: Referential ambiguity\n",
    "\n",
    "\n",
    "<img src=\"img/referential_ambiguity.png\" width=\"800\" height=\"800\">"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- In short, you could do an entire course (or an [entire degree](https://masterdatascience.ubc.ca/programs/computational-linguistics)!) on NLP.\n",
    "- In this class we'll focus on **turning text into numeric features**.\n",
    "- We'll use the [IMDB movie review dataset](https://www.kaggle.com/utathya/imdb-review-dataset) from Kaggle, also used in Lecture 4."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>review</th>\n",
       "      <th>label</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>6105</th>\n",
       "      <td>This movie was a dismal attempt at recreating ...</td>\n",
       "      <td>neg</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9940</th>\n",
       "      <td>These days, Ridley Scott is one of the top dir...</td>\n",
       "      <td>neg</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>45421</th>\n",
       "      <td>on the contrary to the person listed above me ...</td>\n",
       "      <td>pos</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>42236</th>\n",
       "      <td>This is one of those movies that you wish you ...</td>\n",
       "      <td>pos</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>15382</th>\n",
       "      <td>It's hard for me to explain this show to my gr...</td>\n",
       "      <td>pos</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                  review label\n",
       "6105   This movie was a dismal attempt at recreating ...   neg\n",
       "9940   These days, Ridley Scott is one of the top dir...   neg\n",
       "45421  on the contrary to the person listed above me ...   pos\n",
       "42236  This is one of those movies that you wish you ...   pos\n",
       "15382  It's hard for me to explain this show to my gr...   pos"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "imdb_df = pd.read_csv('data/imdb_master.csv', index_col=0, encoding=\"ISO-8859-1\")\n",
    "imdb_df = imdb_df[imdb_df['label'].str.startswith(('pos','neg'))].drop(columns=['file', 'type'])\n",
    "df_train, df_test = train_test_split(imdb_df, random_state=123)\n",
    "df_train.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "\"These days, Ridley Scott is one of the top directors and producers and can command huge sums to helm movies--especially since he has films like ALIEN, GLADIATOR and BLADE RUNNER to his credit. So from this partial list of his credits, it's obvious he's an amazing talent. However, if you watch this very early effort that he made while in film school, you'd probably have a hard time telling that he was destined for greatness. That's because although it has some nice camera-work and style, the film is hopelessly dull and uninvolving. However, considering that it wasn't meant for general release and it was only a training ground, then I am disposed to looking at it charitably--hence the score of 4.<br /><br />By the way, this film is part of the CINEMA 16: European Shorts DVD. On this DVD are 16 shorts. Most aren't great, though because it contains THE MAN WITHOUT A HEAD, COPY SHOP, RABBIT and WASP, it's an amazing DVD for lovers of short films and well worth buying.\""
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_train.iloc[1][\"review\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Text data\n",
    "\n",
    "- How do we feed it into ML algorithms?\n",
    "- How do we represent the meaning of text?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Word counts and TF-IDF (10 min)\n",
    "\n",
    "We've seen `CountVectorizer`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.feature_extraction.text import CountVectorizer"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [],
   "source": [
    "countvec = CountVectorizer()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [],
   "source": [
    "countvec.fit(df_train[\"review\"]);"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [],
   "source": [
    "X_train_counts = countvec.transform(df_train[\"review\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's look at the [documentation](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html).\n",
    "\n",
    "(Note: in Jupyter, you can also view the docstring with `?`, the source code with `??`, or a pop-up docstring with Shift+Tab.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "??CountVectorizer.fit"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[0, 0, 0, ..., 0, 0, 0]])"
      ]
     },
     "execution_count": 34,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X_train_counts[0].toarray()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAZAAAAD9CAYAAACSoiH8AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjMuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/d3fzzAAAACXBIWXMAAAsTAAALEwEAmpwYAAARGklEQVR4nO3df4xlZX3H8fcHF3+A2i6C1ALrskI1S/xVV4ulKiKVKApUEG35FRvE2mprbJuqJY0Wkto01VYaLSANVWg1IArGVhsBgVCwXaogS2NFfqe1ru66CCiCfPvHOWPHy52du8/c2Tt35v1KJnfmuc9zzvPMuXM+95znnLmpKiRJ2lG7TLoDkqTpZIBIkpoYIJKkJgaIJKmJASJJarJq0h3YEXvuuWetXbt20t2QpKlyww03fKeq9hr3cqcqQNauXcvGjRsn3Q1JmipJ7lyM5U4kQJLcATwI/KAv+puq+ugk+iJJajPJI5A3VNVXJ7h+SdICjDSJnmTfJGcluS7JA0kqydo56u6X5OIk25Lcm+SSJGvG2mtJ0sSNehXWAcDxwFbgmrkqJdkNuAJ4FnAKcBJwIHBlkt0Hqn8sydeSfCzJPjvcc0nSRI0aIFdX1d5V9Wrgou3UezOwDjimqj5TVZcCRwFPB94yq97Lquo5wPOBW4GLd7zrkqRJGilAquqREZd3FHB9Vd06q+3twLXA0bPK7uwfHwY+CPxSkl1H7bQkafLGfSPhQcDNQ8o3AesBkuye5GdnPXcCcHNVPTTmvkiSFtG4r8Lag26eZNAWYHX//d7Ap5I8BghwN/D6uRaY5DTgNIA1a5yLl6SlYjEu4x32ASP5yZNVt9HNfYy2sKpzgHMANmzY0PzhJWvf9bnWpgtyx/uPnMh6JWmxjfsU1la6o5BBqxl+ZCJJmlLjDpBNdPMgg9YDt4x5XZKkCRp3gFwGHJxk3UxBf8PhIf1zkqRlYuQ5kCTH9d++oH98VZLNwOaquqovOxd4G3BpktPp5kPOoJsoP3s8XZYkLQU7Mok+eAPhh/vHq4BDAarq/iSH0d3b8XG6yfPLgXdU1X0L66okaSkZOUCqKvPXgqq6Czi2uUeSpKngJxJKkpoYIJKkJgaIJKmJASJJamKASJKaGCCSpCYGiCSpiQEiSWpigEiSmhggkqQmBogkqYkBIklqYoBIkpoYIJKkJgaIJKmJASJJamKASJKaGCCSpCYGiCSpiQEiSWpigEiSmhggkqQmBogkqYkBIklqYoBIkpoYIJKkJgaIJKmJASJJamKASJKaGCCSpCYGiCSpiQEiSWpigEiSmhggkqQmBogkqYkBIklqYoBIkpoYIJKkJgaIJKmJASJJamKASJKaGCCSpCYGiCSpiQEiSWpigEiSmhggkqQmBogkqYkBIklqYoBIkpoYIJKkJgaIJKmJASJJamKASJKaGCCSpCYGiCSpiQEiSWpigEiSmhggkqQmBogkqYkBIklqYoBIkpoYIJKkJgaIJKmJASJJamKASJKarJrESpNcDuwJFPB94O1V9dVJ9EWS1GYiAQK8rqq2AST5NeB84HkT6oskqcFIp7CS7JvkrCTXJXkgSSVZO0fd/ZJcnGRbknuTXJJkzew6M+HRe3J79yVJkzLqHMgBwPHAVuCauSol2Q24AngWcApwEnAgcGWS3QfqXpjkHuAM4MQd77okaZJGPYV1dVXtDZDkVOCVc9R7M7AOeGZV3drXvwn4BvAW4AMzFavqhFnL+3PgyJYBSJImY6QjkKp6ZMTlHQVcPxMefdvbgWuBo+docx7wq0meMuI6JElLwLgv4z0IuHlI+SZgPUCS1UmeNuu5Y4FvA1uGLTDJaUk2Jtm4efPmMXdXktRq3Fdh7UE3TzJoC7C6/3418MkkjwceoQuP11RVDVtgVZ0DnAOwYcOGoXUkSTvfYlzGO2wnn588WXUb8MJFWK8kaSca9ymsrXRHIYNWM/zIRJI0pcYdIJvo
5kEGrQduGfO6JEkTNO4AuQw4OMm6mYL+hsND+uckScvEyHMgSY7rv31B//iqJJuBzVV1VV92LvA24NIkp9PNh5wB3A2cPZ4uS5KWgh2ZRL9o4OcP949XAYcCVNX9SQ4DPgh8nG7y/HLgHVV138K6KklaSkYOkKrK/LWgqu6iu7dDkrSM+XkgkqQmBogkqYkBIklqYoBIkpoYIJKkJgaIJKmJASJJamKASJKaGCCSpCYGiCSpiQEiSWpigEiSmhggkqQmBogkqYkBIklqYoBIkpoYIJKkJgaIJKmJASJJamKASJKaGCCSpCYGiCSpiQEiSWpigEiSmhggkqQmBogkqYkBIklqYoBIkpoYIJKkJgaIJKmJASJJamKASJKaGCCSpCYGiCSpiQEiSWpigEiSmhggkqQmBogkqYkBIklqYoBIkpoYIJKkJgaIJKmJASJJamKASJKaGCCSpCYGiCSpiQEiSWpigEiSmhggkqQmBogkqYkBIklqYoBIkpoYIJKkJgaIJKmJASJJamKASJKaGCCSpCYGiCSpiQEiSWpigEiSmhggkqQmBogkqYkBIklqYoBIkpoYIJKkJhMJkCTvSfL1JI8kOWYSfZAkLcykjkAuB14NXD2h9UuSFmikAEmyb5KzklyX5IEklWTtHHX3S3Jxkm1J7k1ySZI1s+tU1Zer6ptj6L8kaUJGPQI5ADge2ApcM1elJLsBVwDPAk4BTgIOBK5MsvvCuipJWkpWjVjv6qraGyDJqcAr56j3ZmAd8MyqurWvfxPwDeAtwAcW1t3ps/Zdn5vYuu94/5ETW7ek5W+kI5CqemTE5R0FXD8THn3b24FrgaN3vHuSpKVq3JPoBwE3DynfBKxvWWCS05JsTLJx8+bNC+qcJGl8xh0ge9DNkwzaAqye+SHJ6UnuAV4MfDTJPUl+btgCq+qcqtpQVRv22muvMXdXktRqMS7jrSFl+akKVWdW1b5V9biq2rP//luL0BdJ0iIZd4BspTsKGbSa4UcmkqQpNe4A2UQ3DzJoPXDLmNclSZqgcQfIZcDBSdbNFPQ3HB7SPydJWiZGvQ+EJMf1376gf3xVks3A5qq6qi87F3gbcGmS0+nmQ84A7gbOHk+XJUlLwcgBAlw08POH+8ergEMBqur+JIcBHwQ+Tjd5fjnwjqq6b2FdlSQtJSMHSFVl/lpQVXcBxzb3SJI0Ffw8EElSEwNEktTEAJEkNTFAJElNDBBJUhMDRJLUxACRJDXZkRsJNWUm9WmIfhKitDJ4BCJJamKASJKaGCCSpCYGiCSpiZPoGrtJTd6DE/jSzuQRiCSpiQEiSWpigEiSmhggkqQmBogkqYkBIklq4mW80hh46bJWIo9AJElNDBBJUhMDRJLUxACRJDUxQCRJTQwQSVITA0SS1MQAkSQ1MUAkSU0MEElSEwNEktTE/4UlSSOa1P88W6r/78wjEElSEwNEktTEAJEkNTFAJElNDBBJUhMDRJLUxACRJDUxQCRJTQwQSVKTVNWk+zCyJJuBOxub7wl8Z4zdmSYreeywsse/kscOK3v8s8f+9Kraa9wrmKoAWYgkG6tqw6T7MQkreeywsse/kscOK3v8O2PsnsKSJDUxQCRJTVZSgJwz6Q5M0EoeO6zs8a/kscPKHv+ij33FzIFIksZrJR2BSJLGyACRJDWZ6gBJsl+Si5NsS3JvkkuSrBmx7eOT/EWS/0nygyTXJXnpYvd5XJIcl+RTSe7s+//1JH+W5EkjtK05vp63E7q+YEkOnaP/3xuh7VRvd4AkX9rONvz8PG2natsn2TfJWf12eqDv69oh9VYn+WiS7yS5P8kXkzx7xHXskuTdSe5I8sMkNyY5duyD2UGjjD3JK5JckOSb/ev5m0k+kuSpI67jjjleD8eM0n5qP9I2yW7AFcCDwClAAWcCVyZ5TlXdP88izgOOBP4QuA34HeALSV5cVV9dtI6Pzx8AdwHvAe4Bng+8F3h5kl+uqkfmaX8+cPZA2X+N
uY+L7XeBf5/188MjtJn27Q7w28CTB8peDHwAuGyE9uczPdv+AOB44AbgGuCVgxWShG7c+wNvB7YC76bbFzyvqu6ZZx1n0P09/XG/njcCFyV5TVX907gG0mDesQO/BTyRbt93G3Ag8D7giH4/eN8I6/kC3b5jtq+P1MOqmsov4PeAHwMHzCrbn24n8s552j6XLnDeNKtsVf9Lu2zSYxtx/HsNKTu5H9dh87Qt4MxJj2EBYz+0H8PhO9hu6rf7dsZ2Ht2bqT2W07YHdpn1/al9/9cO1Dm6L3/5rLKfAbYAH5pn+U/tf2/vGyi/HLhpCsY+bD/w0r7ub46wjjuAC1r7OM2nsI4Crq+qW2cKqup24Fq6F9R8bR8CPjmr7cPAJ+iS+3Hj7+54VdXmIcUz78b32Zl9mSJTv92HSfIE4PXAZ6tqy6T7M041/5E0dNv1v6vqylnttgGfZf59wRHAY4ELBsovAJ6dZP8d6O5YjTL2Se8HpjlADgJuHlK+CVg/Qtvbq+qBIW0fS3foOI1e1j/+5wh135rkwf7c6hVJXrKYHVskFyb5cZLvJvmHEea/lut2fx3wJODvR6y/HLb9bNvbF6xJ8sR52j4I3DpQvql/nG9fshTtyH4A4LX9a+HBJNePOv8B0x0ge9Cd6xy0BVi9gLYzz0+VJPsAfwp8sao2zlP9Arrz6IcDpwFPAa5Icuhi9nGMtgF/SXdYfxjdOezDgevmmTxcdtu9dzLwbeCfR6g77dt+mPm26/b2B3sA36v+fM6QtlP1mugvovkruvD4zAhNPks3b3QEcALwQ+DTSU4cZX1TO4neG3YXZEZolwW0XXL6d1iX0s3/vGm++lV10qwfr0lyKd07uDOBX1mUTo5RVX0F+MqsoquSXA38G93E+ulzNF1W2x0gyc/ThcFf96fjtmvat/0cFrJdl81rIskq4B/pTl0dMuLr4e0Dy/g0cD3wZzz6tN6jTPMRyFaGvztYzfB3I7Nt2U7bmeenQpLH012Bsg44oua/4uRRqur7wOeAF465eztNVf0H3ZVE2xvDstnus5xI93c86umrn7Ictj3zb9ft7Q+2AKv7K7mGtZ2K10SSmdfA4cAxVXVTy3Kq6sfARcC+SZ42X/1pDpBNdOcvB60Hbhmh7f79pcCDbX/Eo8+HLklJdgU+BbwIeHVVfW0hi2P4O7FpMt8YlsV2H3AycGNV3biAZUz7tt/evuCu2v6lrJuAxwHPGNIW5t+XLBV/C7wBeGNVXb7AZc2E6byviWkOkMuAg5Osmynob7I5hPmvhb8M2JXuypWZtqvoNsC/VNWDY+/tmPXvOC4EXgEcXVXXL2BZT6a7N+LLY+reTpdkA/ALbH8MU7/dZ+vHfBCNRx/9MqZ+29Nt132SzEwez4zrtcy/L/g83ZuHEwbKTwRu7q/sXNKSzMwHvqmqPrPAZa2i+/u4q6q+NW+DSV7nvMBrpHene8f4NbpL9Y4CbqS7meaJs+o9nW5u4E8G2n+C7tD2VLqd8MV0E0i/OOmxjTj+j/D/N08ePPC171xjp7th6lzgN+jupzil/x3+CHjJpMc14tgv7Mf9OrpJ9N+n++S1u4A9l/N2HxjLh+guS957yHPLZtsDx/VfM6/5t/Y/v6x/fhfgX4G76W4CPAL4Et3pp/0GlvUwcN5A2fv718A7+9/LR4BHgNdOwdj/qC8/b8h+4BnbGzvw6/3fw8nAy/vf3TX98t44Uv8m/Qta4C93Dd0pnHuB79NddbB2oM7a/hfy3oHyJ9Ddufut/sXzZeDQSY9pB8Z+Rz+uYV/vnWvsdO/KrqXb4T4EfJfuXdqLJj2mHRj7u4Gb6K7GeqjfcZwDPG25b/dZ49gV2Ex378ew55fNtt/O6/xLs+rsAfwdXWg8QHcj4HPnWNb5A2WPobvw4k66S3pvAo6b9LhHGTtdUM5VZ3CcP1VGFzJXAP/bvx62AV+km0sdqX/+O3dJUpNpngORJE2QASJJamKASJKaGCCSpCYG
iCSpiQEiSWpigEiSmhggkqQm/wfon+PSQGBZYQAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.hist(X_train_counts[0].toarray().flatten(), log=True);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
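    "- `CountVectorizer` has many hyperparameters; we'll talk about several of them in a few minutes.\n",
    "- The ones that most directly control the number of features are `max_features`, `min_df`, and `max_df`.\n",
    "- `min_df` drops words that appear in too few documents, `max_df` drops words that appear in too many, and `max_features` keeps only the most frequent words."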
   ]
  },
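  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough illustration of how these three hyperparameters shrink the vocabulary, here is a minimal sketch on a tiny made-up corpus. The names `toy_docs` and `vocab_size` are hypothetical, introduced only for this example; they are not part of the IMDB workflow in this lecture.\n",
    "\n",
    "```python\n",
    "from sklearn.feature_extraction.text import CountVectorizer\n",
    "\n",
    "# Hypothetical toy corpus, made up for illustration.\n",
    "toy_docs = [\n",
    "    'good movie',\n",
    "    'good plot',\n",
    "    'good acting bad plot',\n",
    "    'bad movie',\n",
    "]\n",
    "\n",
    "def vocab_size(**kwargs):\n",
    "    # Fit a CountVectorizer with the given settings and count its features.\n",
    "    return CountVectorizer(**kwargs).fit_transform(toy_docs).shape[1]\n",
    "\n",
    "n_all = vocab_size()                    # every distinct word\n",
    "n_min = vocab_size(min_df=2)            # drop words found in fewer than 2 documents\n",
    "n_max = vocab_size(min_df=2, max_df=2)  # also drop words found in more than 2 documents\n",
    "n_top = vocab_size(max_features=1)      # keep only the single most frequent word\n",
    "print(n_all, n_min, n_max, n_top)\n",
    "```\n",
    "\n",
    "On this toy corpus the vocabulary sizes should come out as 5, 4, 3, and 1. Note that an integer `min_df` or `max_df` is a number of documents, while a float is interpreted as a fraction of documents."
   ]
  },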
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(37500, 91308)"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X_train_counts.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(37500, 32743)"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "CountVectorizer(min_df=5).fit_transform(df_train[\"review\"]).shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(37500, 27650)"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "CountVectorizer(min_df=5, max_df=100).fit_transform(df_train[\"review\"]).shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(37500, 1000)"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "CountVectorizer(min_df=5, max_df=100, max_features=1000).fit_transform(df_train[\"review\"]).shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(37500, 4)"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "CountVectorizer(vocabulary=[\"good\", \"bad\", \"silly\", \"horrible\"]).fit_transform(df_train[\"review\"]).shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Break (5 min)\n",
    "\n",
    "REMINDER TO RESUME RECORDING"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- `stop_words` removes common words like \"the\", \"and\", etc.\n",
    "- `ngram_range` lets you count sequences of consecutive words as features; for example, `ngram_range=(1, 2)` uses both single words and pairs of words."
   ]
  },
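  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see what these two options do to the vocabulary, here is a minimal sketch on a single made-up sentence. The sentence and the variable names (`toy_docs`, `unigrams`, `bigrams`, `no_stop`) are assumptions for illustration only.\n",
    "\n",
    "```python\n",
    "from sklearn.feature_extraction.text import CountVectorizer\n",
    "\n",
    "# Hypothetical one-sentence corpus, made up for illustration.\n",
    "toy_docs = ['the movie was not good']\n",
    "\n",
    "# Single words only (the default, ngram_range=(1, 1)).\n",
    "unigrams = CountVectorizer().fit(toy_docs)\n",
    "# Single words plus pairs of consecutive words.\n",
    "bigrams = CountVectorizer(ngram_range=(1, 2)).fit(toy_docs)\n",
    "# Same as above, after dropping scikit-learn's built-in English stop words.\n",
    "no_stop = CountVectorizer(ngram_range=(1, 2), stop_words='english').fit(toy_docs)\n",
    "\n",
    "for name, fitted in [('unigrams', unigrams), ('bigrams', bigrams), ('no_stop', no_stop)]:\n",
    "    # vocabulary_ maps each feature (word or n-gram) to its column index.\n",
    "    print(name, sorted(fitted.vocabulary_))\n",
    "```\n",
    "\n",
    "One thing to watch for: stop words are removed before the n-grams are built, and the built-in English list includes `not`, so the negation in `not good` is lost in the last vocabulary. Whether that matters depends on the task."
   ]
  },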
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {},
   "outputs": [],
   "source": [
    "vec = CountVectorizer(ngram_range=(1,2), min_df=5, max_df=100, \n",
    "                      max_features=1000, stop_words=\"english\")\n",
    "X = vec.fit_transform(df_train[\"review\"])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>1933</th>\n",
       "      <th>1939</th>\n",
       "      <th>1940s</th>\n",
       "      <th>1945</th>\n",
       "      <th>1955</th>\n",
       "      <th>1969</th>\n",
       "      <th>1973</th>\n",
       "      <th>1976</th>\n",
       "      <th>1978</th>\n",
       "      <th>1982</th>\n",
       "      <th>...</th>\n",
       "      <th>working class</th>\n",
       "      <th>wrenching</th>\n",
       "      <th>wretched</th>\n",
       "      <th>wright</th>\n",
       "      <th>wwe</th>\n",
       "      <th>www</th>\n",
       "      <th>yard</th>\n",
       "      <th>yawn</th>\n",
       "      <th>yep</th>\n",
       "      <th>youngest</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37495</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37496</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37497</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37498</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>37499</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>...</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>37500 rows × 1000 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       1933  1939  1940s  1945  1955  1969  1973  1976  1978  1982  ...  \\\n",
       "0         0     0      0     0     0     0     0     0     0     0  ...   \n",
       "1         0     0      0     0     0     0     0     0     0     0  ...   \n",
       "2         0     0      0     0     0     0     0     0     0     0  ...   \n",
       "3         0     0      0     0     0     0     0     0     0     0  ...   \n",
       "4         0     0      0     0     0     0     0     0     0     0  ...   \n",
       "...     ...   ...    ...   ...   ...   ...   ...   ...   ...   ...  ...   \n",
       "37495     0     0      0     0     0     0     0     0     0     0  ...   \n",
       "37496     0     0      0     0     0     0     0     0     0     0  ...   \n",
       "37497     0     0      0     0     0     0     0     0     0     0  ...   \n",
       "37498     0     0      0     0     0     0     0     0     0     0  ...   \n",
       "37499     0     0      0     0     0     0     0     0     0     0  ...   \n",
       "\n",
       "       working class  wrenching  wretched  wright  wwe  www  yard  yawn  yep  \\\n",
       "0                  0          0         0       0    0    0     0     0    0   \n",
       "1                  0          0         0       0    0    0     0     0    0   \n",
       "2                  0          0         0       0    0    0     0     0    0   \n",
       "3                  0          0         0       0    0    0     0     0    0   \n",
       "4                  0          0         0       0    0    0     1     0    0   \n",
       "...              ...        ...       ...     ...  ...  ...   ...   ...  ...   \n",
       "37495              0          0         0       0    0    0     0     0    0   \n",
       "37496              0          0         0       0    0    0     0     0    0   \n",
       "37497              0          0         0       0    0    0     0     0    0   \n",
       "37498              0          0         0       0    0    0     0     0    0   \n",
       "37499              0          0         0       0    0    0     0     0    0   \n",
       "\n",
       "       youngest  \n",
       "0             0  \n",
       "1             0  \n",
       "2             0  \n",
       "3             0  \n",
       "4             0  \n",
       "...         ...  \n",
       "37495         0  \n",
       "37496         0  \n",
       "37497         0  \n",
       "37498         0  \n",
       "37499         0  \n",
       "\n",
       "[37500 rows x 1000 columns]"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "pd.DataFrame(data=X.toarray(), columns=vec.get_feature_names_out())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [],
   "source": [
    "# vec.get_feature_names_out()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Term Frequency - Inverse Document Frequency (TF-IDF)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "source": [
    "- Intuition: down-weight a word's count by how common the word is across the entire dataset.\n",
    "- If \"earthshattering\" appears 10 times in a review, that is more meaningful than if \"movie\" appears 10 times, because \"movie\" shows up in a large share of the reviews."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 51,
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.feature_extraction.text import TfidfVectorizer"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With the same settings, the output has the same shape as what we get from `CountVectorizer`, but the counts are replaced by normalized weights; we won't go into the details."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(37500, 1000)"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X_train_tfidf = TfidfVectorizer(max_features=1000).fit_transform(df_train[\"review\"])\n",
    "X_train_tfidf.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.hist(X_train_tfidf[0].toarray().flatten(), log=True);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Let's look at a very simple case:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "corpus = [\n",
    "            \"This is the first document, the FIRST\",\n",
    "            \"This is the second document\",\n",
    "            \"This is the third document\",\n",
    "            \"This is the fourth document\"\n",
    "         ]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>document</th>\n",
       "      <th>first</th>\n",
       "      <th>fourth</th>\n",
       "      <th>is</th>\n",
       "      <th>second</th>\n",
       "      <th>the</th>\n",
       "      <th>third</th>\n",
       "      <th>this</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>D1</th>\n",
       "      <td>1</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>D2</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>D3</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>D4</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    document  first  fourth  is  second  the  third  this\n",
       "D1         1      2       0   1       0    2      0     1\n",
       "D2         1      0       0   1       1    1      0     1\n",
       "D3         1      0       0   1       0    1      1     1\n",
       "D4         1      0       1   1       0    1      0     1"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "vectorizer = CountVectorizer()\n",
    "X = vectorizer.fit_transform(corpus)\n",
    "header = vectorizer.get_feature_names_out()\n",
    "labels = ['D1', 'D2', 'D3', 'D4']\n",
    "df = pd.DataFrame(X.toarray(), columns = header, index = labels)  \n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>document</th>\n",
       "      <th>first</th>\n",
       "      <th>fourth</th>\n",
       "      <th>is</th>\n",
       "      <th>second</th>\n",
       "      <th>the</th>\n",
       "      <th>third</th>\n",
       "      <th>this</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>D1</th>\n",
       "      <td>0.214725</td>\n",
       "      <td>0.822953</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.214725</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.429451</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.214725</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>D2</th>\n",
       "      <td>0.361028</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.361028</td>\n",
       "      <td>0.691835</td>\n",
       "      <td>0.361028</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.361028</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>D3</th>\n",
       "      <td>0.361028</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.361028</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.361028</td>\n",
       "      <td>0.691835</td>\n",
       "      <td>0.361028</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>D4</th>\n",
       "      <td>0.361028</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.691835</td>\n",
       "      <td>0.361028</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.361028</td>\n",
       "      <td>0.000000</td>\n",
       "      <td>0.361028</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    document     first    fourth        is    second       the     third  \\\n",
       "D1  0.214725  0.822953  0.000000  0.214725  0.000000  0.429451  0.000000   \n",
       "D2  0.361028  0.000000  0.000000  0.361028  0.691835  0.361028  0.000000   \n",
       "D3  0.361028  0.000000  0.000000  0.361028  0.000000  0.361028  0.691835   \n",
       "D4  0.361028  0.000000  0.691835  0.361028  0.000000  0.361028  0.000000   \n",
       "\n",
       "        this  \n",
       "D1  0.214725  \n",
       "D2  0.361028  \n",
       "D3  0.361028  \n",
       "D4  0.361028  "
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "vectorizer = TfidfVectorizer()\n",
    "X = vectorizer.fit_transform(corpus)\n",
    "header = vectorizer.get_feature_names_out()\n",
    "labels = ['D1', 'D2', 'D3', 'D4']\n",
    "df = pd.DataFrame(X.toarray(), columns = header, index = labels)  \n",
    "df"
   ]
  },
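  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For the curious, the numbers in the TF-IDF table can be reproduced by hand. The sketch below assumes scikit-learn's default settings (`smooth_idf=True`, `norm='l2'`, `sublinear_tf=False`), under which the inverse document frequency of a word `t` is `idf(t) = ln((1 + n) / (1 + df(t))) + 1`, where `n` is the number of documents and `df(t)` is the number of documents containing `t`. The variable names are illustrative.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer\n",
    "\n",
    "corpus = [\n",
    "    'This is the first document, the FIRST',\n",
    "    'This is the second document',\n",
    "    'This is the third document',\n",
    "    'This is the fourth document',\n",
    "]\n",
    "\n",
    "# Raw word counts, one row per document and one column per word.\n",
    "counts = CountVectorizer().fit_transform(corpus).toarray()\n",
    "\n",
    "n_docs = counts.shape[0]\n",
    "# Document frequency: the number of documents that contain each word.\n",
    "doc_freq = (counts > 0).sum(axis=0)\n",
    "# Smoothed inverse document frequency (assumes the scikit-learn defaults).\n",
    "idf = np.log((1 + n_docs) / (1 + doc_freq)) + 1\n",
    "# Weight the counts, then scale each row to unit Euclidean length.\n",
    "weighted = counts * idf\n",
    "manual_tfidf = weighted / np.linalg.norm(weighted, axis=1, keepdims=True)\n",
    "\n",
    "sklearn_tfidf = TfidfVectorizer().fit_transform(corpus).toarray()\n",
    "matches = np.allclose(manual_tfidf, sklearn_tfidf)\n",
    "print(matches)\n",
    "```\n",
    "\n",
    "Words that occur in every document (`document`, `is`, `the`, `this`) get the smallest possible weight, `idf = 1`, while `first` occurs in only one document and gets `idf = ln(5/2) + 1`, which is why it dominates row D1."
   ]
  },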
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note: there is also [`TfidfTransformer`](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfTransformer.html) which takes the word counts from `CountVectorizer` and transforms them. The output should be the same as if you used `TfidfVectorizer`."
   ]
  },
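  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal sketch of that equivalence, assuming default settings for all three classes. The `toy_docs` list is made up for illustration and is not the lecture corpus. The final comparison should evaluate to `True`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer, TfidfVectorizer\n",
    "from sklearn.pipeline import make_pipeline\n",
    "\n",
    "# Illustrative documents (an assumption, not the lecture corpus)\n",
    "toy_docs = ['the first document', 'the second document', 'the third one']\n",
    "\n",
    "# Two steps: raw word counts, then TF-IDF weighting of those counts\n",
    "two_step = make_pipeline(CountVectorizer(), TfidfTransformer())\n",
    "X_two_step = two_step.fit_transform(toy_docs)\n",
    "\n",
    "# One step: TfidfVectorizer does both at once\n",
    "X_one_step = TfidfVectorizer().fit_transform(toy_docs)\n",
    "\n",
    "# Compare the two results element by element\n",
    "np.allclose(X_two_step.toarray(), X_one_step.toarray())"
   ]
  },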
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Word embeddings (15 min)\n",
    "\n",
    "- Word embeddings: \"embed\" a word in a vector space.\n",
    "- You have a bunch of feature columns, each word has a representation."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### Vector space model\n",
    "\n",
    "- Model the meaning of a word by placing it into a vector space.  \n",
    "- Distances among words in the vector space indicate the relationship between them. \n",
    "\n",
    "<img src=\"img/t-SNE_word_embeddings.png\" width=\"700\" height=\"700\">\n",
    "\n",
    "(Attribution: Jurafsky and Martin 3rd edition)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Embeddings\n",
    "\n",
    "- This is actually exactly what we did in Lecture 13 with the images.\n",
    "- We took a pre-trained network, and used it as a transformer that turns images into vectors. \n",
    "- We will be doing something similar today.\n",
    "- But those were trained on a _supervised learning_ task: given an image, predict its class.\n",
    "- How do we do the same for language?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### Distributional hypothesis\n",
    "\n",
    "<blockquote> \n",
    "    <p>You shall know a word by the company it keeps.</p>\n",
    "    <footer>Firth, 1957</footer>        \n",
    "</blockquote>\n",
    "\n",
    "<blockquote> \n",
    "If A and B have almost identical environments we say that they are synonyms.\n",
    "<footer>Harris, 1954</footer>    \n",
    "</blockquote>    \n",
    "\n",
    "Example: \n",
    "\n",
    "- Her **child** loves to play in the playground. \n",
    "- Her **kid** loves to play in the playground. \n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### Co-occurrence matrices\n",
    "- A way to represent vectors into a vector space\n",
    "\n",
    "#### Term-document matrix\n",
    "\n",
    "- This like the transpose of `CountVectorizer`.\n",
    "- Each cell is a count of words in the document in that column. \n",
    "- You can describe a document in terms of the frequencies of different words in it. \n",
    "- You can describe a word in terms of its frequency in different documents. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>D1</th>\n",
       "      <th>D2</th>\n",
       "      <th>D3</th>\n",
       "      <th>D4</th>\n",
       "      <th>D5</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>apricot</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>big</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>data</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>information</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>sugar</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>walnut</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             D1  D2  D3  D4  D5\n",
       "apricot       0   0   1   0   0\n",
       "big           1   0   1   0   1\n",
       "data          1   0   0   1   1\n",
       "information   1   0   0   1   1\n",
       "sugar         0   1   0   0   0\n",
       "walnut        0   1   0   0   0"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "corpus = ['information big data',\n",
    "        'walnut sugar',\n",
    "        'apricot big',\n",
    "        'information data',\n",
    "        'big data information']\n",
    "vectorizer = CountVectorizer()\n",
    "X = vectorizer.fit_transform(corpus)\n",
    "header = vectorizer.get_feature_names()\n",
    "labels = ['D1', 'D2', 'D3', 'D4', 'D5']\n",
    "df = pd.DataFrame(X.toarray(), columns = header, index = labels)  \n",
    "df.transpose()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- Note the similarity to the item-user matrix from last class!!\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Term-term matrix\n",
    "\n",
    "- The idea is to go through a corpus of text, keeping a count of all of the words that appear in its context within a window."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "corpus = ['information big data',\n",
    "        'walnut sugar',\n",
    "        'apricot big',\n",
    "        'information data',\n",
    "        'big data information']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can ignore the code in the next cell:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>apricot</th>\n",
       "      <th>big</th>\n",
       "      <th>data</th>\n",
       "      <th>information</th>\n",
       "      <th>sugar</th>\n",
       "      <th>walnut</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>apricot</th>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>big</th>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>data</th>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>0</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>information</th>\n",
       "      <td>0</td>\n",
       "      <td>2</td>\n",
       "      <td>3</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>sugar</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>walnut</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "             apricot  big  data  information  sugar  walnut\n",
       "apricot            0    1     0            0      0       0\n",
       "big                1    0     2            2      0       0\n",
       "data               0    2     0            3      0       0\n",
       "information        0    2     3            0      0       0\n",
       "sugar              0    0     0            0      0       1\n",
       "walnut             0    0     0            0      1       0"
      ]
     },
     "execution_count": 58,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "vec = CountVectorizer(ngram_range=(1,1)) \n",
    "X = vec.fit_transform(corpus)\n",
    "X_ww = X.T@X\n",
    "X_ww.setdiag(0) \n",
    "header = vec.get_feature_names()\n",
    "ind_col = header\n",
    "df = pd.DataFrame(X_ww.toarray(), columns = header, index = ind_col)  \n",
    "df"
   ]
  },
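  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The code above uses each whole document as the context. As a rough sketch of the window idea, the cell below counts only neighbours within `window` positions of each word. The names `toy_corpus`, `window`, `pair_counts`, and `window_df` are illustrative assumptions, and documents are split on whitespace rather than with `CountVectorizer`. With `window = 1`, `big` and `information` co-occur once here rather than twice, because they are not adjacent in the last document."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from collections import Counter\n",
    "\n",
    "import pandas as pd\n",
    "\n",
    "# Same toy documents as above, under a new name so `corpus` is not overwritten\n",
    "toy_corpus = ['information big data',\n",
    "              'walnut sugar',\n",
    "              'apricot big',\n",
    "              'information data',\n",
    "              'big data information']\n",
    "window = 1  # assumption: only words at most 1 position away count as context\n",
    "\n",
    "pair_counts = Counter()\n",
    "for doc in toy_corpus:\n",
    "    tokens = doc.split()\n",
    "    for i, word in enumerate(tokens):\n",
    "        start = max(0, i - window)\n",
    "        stop = min(len(tokens), i + window + 1)\n",
    "        for j in range(start, stop):\n",
    "            if j != i:  # a word is not its own context\n",
    "                pair_counts[(word, tokens[j])] += 1\n",
    "\n",
    "# Arrange the pair counts as a symmetric term-term table\n",
    "vocab = sorted({w for doc in toy_corpus for w in doc.split()})\n",
    "window_df = pd.DataFrame(0, index=vocab, columns=vocab)\n",
    "for (w1, w2), count in pair_counts.items():\n",
    "    window_df.loc[w1, w2] = count\n",
    "window_df"
   ]
  },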
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [],
   "source": [
    "df_subset = df[['big','data']]\n",
    "df_subset.iloc[2:]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### Sparse vs. dense word vectors\n",
    "\n",
    "- Term-term and term-document matrices are sparse. \n",
    "- OK because there are efficient ways to deal with sparse matrices.\n",
    "\n",
    "\n",
    "#### Alternative \n",
    "- Learn short (~100 to 1000 dimensions) and dense vectors. \n",
    "- These short dense representations of words are referred to as **word embeddings**.\n",
    "- Short vectors may be easier to train with ML models (less weights to train).\n",
    "- They may generalize better."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### How can we get dense vectors?\n",
    " \n",
    "- Count-based methods\n",
    "    - Singular Value Decomposition (SVD) - beyond the scope of the course\n",
    "- Prediction-based methods\n",
    "    - [Word2Vec](https://github.com/tmikolov/word2vec)\n",
    "    - [fastText](https://fasttext.cc/)\n",
    "    - [GloVe](https://nlp.stanford.edu/projects/glove/)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Word2Vec\n",
    "\n",
    "<img src=\"img/word2vec.png\" width=\"700\" height=\"700\">\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### Success of Word2Vec\n",
    "\n",
    "- Able to capture complex relationships between words.\n",
    "- Example: What is the word that is similar to **WOMAN** in the same sense as **KING** is similar to **MAN**?\n",
    "- Perform a simple algebraic operations with the vector representation of words.\n",
    "    $\\vec{X} = \\vec{\\text{KING}} − \\vec{\\text{MAN}} + \\vec{\\text{WOMAN}}$\n",
    "- Search in the vector space for the word closest to $\\vec{X}$ measured by cosine distance.\n",
    "\n",
    "<img src=\"img/word_analogies1.png\" width=\"500\" height=\"500\">\n",
    "\n",
    "(Credit: Mikolov et al. 2013)    \n"
   ]
  },
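  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A toy sketch of this search, using hand-picked 2-dimensional vectors (hypothetical values, not real Word2Vec embeddings). The helper `cosine_similarity` and the dictionary `toy_vectors` are made up for illustration. The query words are excluded from the candidates, as gensim's `most_similar` also does."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "\n",
    "# Hypothetical vectors: first axis ~ 'royalty', second axis ~ 'gender'\n",
    "toy_vectors = {\n",
    "    'king':  np.array([0.9, 0.9]),\n",
    "    'queen': np.array([0.9, -0.9]),\n",
    "    'man':   np.array([0.1, 0.9]),\n",
    "    'woman': np.array([0.1, -0.9]),\n",
    "    'apple': np.array([-0.8, 0.1]),\n",
    "}\n",
    "\n",
    "def cosine_similarity(u, v):\n",
    "    # Cosine of the angle between u and v (1 means same direction)\n",
    "    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))\n",
    "\n",
    "# X = KING - MAN + WOMAN\n",
    "x = toy_vectors['king'] - toy_vectors['man'] + toy_vectors['woman']\n",
    "\n",
    "# Search the remaining words for the one closest to X\n",
    "candidates = {w: v for w, v in toy_vectors.items() if w not in ('king', 'man', 'woman')}\n",
    "best = max(candidates, key=lambda w: cosine_similarity(x, candidates[w]))\n",
    "best"
   ]
  },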
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### Pre-trained embeddings\n",
    "\n",
    "- These embeddings come from training complicated ML models on big data sets.\n",
    "- However, we can skip this step and use **pre-trained** embeddings.\n",
    "- Just like in the computer vision lecture, someone already did `fit` for us and published the results, we can download and just use `transform`.\n",
    "\n",
    "A number of pre-trained word embeddings are available. Some popular ones are:  \n",
    "\n",
    "- [word2vec](https://code.google.com/archive/p/word2vec/)\n",
    "    * trained on several corpora using the word2vec algorithm \n",
    "- [GloVe](https://nlp.stanford.edu/projects/glove/)\n",
    "    * trained using [the GloVe algorithm](https://nlp.stanford.edu/pubs/glove.pdf) \n",
    "    * published by Stanford University \n",
    "- [fastText pre-trained embeddings for 294 languages](https://fasttext.cc/docs/en/pretrained-vectors.html) \n",
    "    * trained using [the fastText algorithm](http://aclweb.org/anthology/Q17-1010)\n",
    "    * published by Facebook"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Load Google's pre-trained Word2Vec model:\n",
    "\n",
    "- Requires download a 1.5 GB zipped file [here](https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing) (3.6 GB unzipped)\n",
    "- Feel free to just watch the lecture if you don't want to download this huge file!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "model = KeyedVectors.load_word2vec_format('data/GoogleNews-vectors-negative300.bin', binary=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 63,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Size of vocabulary:  3000000\n"
     ]
    }
   ],
   "source": [
    "print('Size of vocabulary: ', len(model.vocab))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "The similarity between pineapple and mango is 0.668\n",
      "The similarity between pineapple and juice is 0.418\n",
      "The similarity between sun and robot is 0.029\n"
     ]
    }
   ],
   "source": [
    "word_pairs = [('pineapple','mango'), ('pineapple','juice'), ('sun','robot')]\n",
    "for pair in word_pairs: \n",
    "    print('The similarity between %s and %s is %0.3f' %(pair[0], pair[1], model.similarity(pair[0], pair[1])))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "### Finding similar words\n",
    "Given word $w$, search in the vector space for the word closest to $w$:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "# model.most_similar('mango')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 61,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('UVic', 0.7886475324630737),\n",
       " ('SFU', 0.7588527798652649),\n",
       " ('Simon_Fraser', 0.7356574535369873),\n",
       " ('UFV', 0.6880435943603516),\n",
       " ('VIU', 0.6778583526611328),\n",
       " ('Kwantlen', 0.677142858505249),\n",
       " ('UBCO', 0.6734487414360046),\n",
       " ('UPEI', 0.6731126308441162),\n",
       " ('UBC_Okanagan', 0.6709134578704834),\n",
       " ('Lakehead_University', 0.6622507572174072)]"
      ]
     },
     "execution_count": 61,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.most_similar('UBC')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 62,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[('info', 0.7363681793212891),\n",
       " ('infomation', 0.6800296306610107),\n",
       " ('infor_mation', 0.6733849048614502),\n",
       " ('informaiton', 0.6639008522033691),\n",
       " ('informa_tion', 0.660125732421875),\n",
       " ('informationon', 0.633933424949646),\n",
       " ('informationabout', 0.6320979595184326),\n",
       " ('Information', 0.6186580061912537),\n",
       " ('informaion', 0.6093292236328125),\n",
       " ('details', 0.6063089370727539)]"
      ]
     },
     "execution_count": 62,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "model.most_similar('information')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### Finding the odd one out"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 64,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "UBC\n"
     ]
    }
   ],
   "source": [
    "print(model.doesnt_match(\"sun moon earth UBC mars\".split()))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "source": [
    "#### Distance between sentences"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Requires `pip install pyemd` (forgot to put this in the course environment)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 65,
   "metadata": {
    "slideshow": {
     "slide_type": "skip"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Distance between related sentences 2.2813\n",
      "Distance between unrelated sentences 3.4133\n"
     ]
    }
   ],
   "source": [
    "sentence_obama = 'Obama speaks to the media in Illinois'.lower().split()\n",
    "sentence_president = 'The president greets the press in Chicago'.lower().split()\n",
    "sentence_unrelated = 'Data science is a multidisciplinary blend of data inference, algorithmm development, and technology.'\n",
    "\n",
    "similarity = model.wmdistance(sentence_obama, sentence_president)\n",
    "print(\"Distance between related sentences {:.4f}\".format(similarity))\n",
    "\n",
    "similarity = model.wmdistance(sentence_obama, sentence_unrelated)\n",
    "print(\"Distance between unrelated sentences {:.4f}\".format(similarity))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [],
   "source": [
    "def analogy(word1, word2, word3):\n",
    "    print('%s : %s :: %s : ?' %(word1, word2, word3))\n",
    "    sim_words = model.most_similar(positive=[word3, word2], negative=[word1])\n",
    "    return pd.DataFrame(sim_words, columns=['Analogy word', 'Score'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 67,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "man : king :: woman : ?\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Analogy word</th>\n",
       "      <th>Score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>queen</td>\n",
       "      <td>0.711819</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>monarch</td>\n",
       "      <td>0.618967</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>princess</td>\n",
       "      <td>0.590243</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>crown_prince</td>\n",
       "      <td>0.549946</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>prince</td>\n",
       "      <td>0.537732</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>kings</td>\n",
       "      <td>0.523684</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Queen_Consort</td>\n",
       "      <td>0.523595</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>queens</td>\n",
       "      <td>0.518113</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>sultan</td>\n",
       "      <td>0.509859</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>monarchy</td>\n",
       "      <td>0.508741</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "    Analogy word     Score\n",
       "0          queen  0.711819\n",
       "1        monarch  0.618967\n",
       "2       princess  0.590243\n",
       "3   crown_prince  0.549946\n",
       "4         prince  0.537732\n",
       "5          kings  0.523684\n",
       "6  Queen_Consort  0.523595\n",
       "7         queens  0.518113\n",
       "8         sultan  0.509859\n",
       "9       monarchy  0.508741"
      ]
     },
     "execution_count": 67,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "analogy('man','king','woman')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Montreal : Canadiens :: Vancouver : ?\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Analogy word</th>\n",
       "      <th>Score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Canucks</td>\n",
       "      <td>0.821327</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Vancouver_Canucks</td>\n",
       "      <td>0.750401</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Calgary_Flames</td>\n",
       "      <td>0.705470</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Leafs</td>\n",
       "      <td>0.695783</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Maple_Leafs</td>\n",
       "      <td>0.691617</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Thrashers</td>\n",
       "      <td>0.687504</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>Avs</td>\n",
       "      <td>0.681716</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Sabres</td>\n",
       "      <td>0.665307</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>Blackhawks</td>\n",
       "      <td>0.664625</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Habs</td>\n",
       "      <td>0.661023</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        Analogy word     Score\n",
       "0            Canucks  0.821327\n",
       "1  Vancouver_Canucks  0.750401\n",
       "2     Calgary_Flames  0.705470\n",
       "3              Leafs  0.695783\n",
       "4        Maple_Leafs  0.691617\n",
       "5          Thrashers  0.687504\n",
       "6                Avs  0.681716\n",
       "7             Sabres  0.665307\n",
       "8         Blackhawks  0.664625\n",
       "9               Habs  0.661023"
      ]
     },
     "execution_count": 68,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "analogy('Montreal', 'Canadiens', 'Vancouver')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Microsoft : Windows :: Apple : ?\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Analogy word</th>\n",
       "      <th>Score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Macs</td>\n",
       "      <td>0.673568</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>iMac</td>\n",
       "      <td>0.646340</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Mac_OS</td>\n",
       "      <td>0.640714</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>iPhone</td>\n",
       "      <td>0.640588</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>iPad</td>\n",
       "      <td>0.633464</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>OS_X</td>\n",
       "      <td>0.632136</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>iBook</td>\n",
       "      <td>0.626197</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>iMacs</td>\n",
       "      <td>0.619245</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>iOS</td>\n",
       "      <td>0.617178</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Mac_mini</td>\n",
       "      <td>0.611140</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  Analogy word     Score\n",
       "0         Macs  0.673568\n",
       "1         iMac  0.646340\n",
       "2       Mac_OS  0.640714\n",
       "3       iPhone  0.640588\n",
       "4         iPad  0.633464\n",
       "5         OS_X  0.632136\n",
       "6        iBook  0.626197\n",
       "7        iMacs  0.619245\n",
       "8          iOS  0.617178\n",
       "9     Mac_mini  0.611140"
      ]
     },
     "execution_count": 69,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "analogy('Microsoft', 'Windows', 'Apple')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Gauss : mathematician :: Socrates : ?\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Analogy word</th>\n",
       "      <th>Score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>philosopher</td>\n",
       "      <td>0.540793</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Socrates_Plato</td>\n",
       "      <td>0.478897</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>philosopher_Aristotle</td>\n",
       "      <td>0.467387</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>philosopher_Socrates</td>\n",
       "      <td>0.459890</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>Aristotle</td>\n",
       "      <td>0.455209</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>Philosopher</td>\n",
       "      <td>0.452536</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>logician</td>\n",
       "      <td>0.448070</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>Nobel_laureate_José_Saramago</td>\n",
       "      <td>0.444382</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>poet</td>\n",
       "      <td>0.443180</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>Plato_Socrates</td>\n",
       "      <td>0.441025</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                   Analogy word     Score\n",
       "0                   philosopher  0.540793\n",
       "1                Socrates_Plato  0.478897\n",
       "2         philosopher_Aristotle  0.467387\n",
       "3          philosopher_Socrates  0.459890\n",
       "4                     Aristotle  0.455209\n",
       "5                   Philosopher  0.452536\n",
       "6                      logician  0.448070\n",
       "7  Nobel_laureate_José_Saramago  0.444382\n",
       "8                          poet  0.443180\n",
       "9                Plato_Socrates  0.441025"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "analogy('Gauss', 'mathematician', 'Socrates')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "#### Implicit biases and stereotypes in word embeddings\n",
    "\n",
    "- Word embeddings reflect the gender stereotypes present in broader society.\n",
    "- They may also amplify these stereotypes because of their widespread use.\n",
    "- See [this paper](http://papers.nips.cc/paper/6228-man-is-to-computer-programmer-as-woman-is-to-homemaker-debiasing-word-embeddings.pdf) on debiasing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "man : computer_programmer :: woman : ?\n"
     ]
    },
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Analogy word</th>\n",
       "      <th>Score</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>homemaker</td>\n",
       "      <td>0.562712</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>housewife</td>\n",
       "      <td>0.510505</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>graphic_designer</td>\n",
       "      <td>0.505180</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>schoolteacher</td>\n",
       "      <td>0.497949</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>businesswoman</td>\n",
       "      <td>0.493489</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>5</th>\n",
       "      <td>paralegal</td>\n",
       "      <td>0.492551</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>registered_nurse</td>\n",
       "      <td>0.490797</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>7</th>\n",
       "      <td>saleswoman</td>\n",
       "      <td>0.488163</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>electrical_engineer</td>\n",
       "      <td>0.479773</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>9</th>\n",
       "      <td>mechanical_engineer</td>\n",
       "      <td>0.475540</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "          Analogy word     Score\n",
       "0            homemaker  0.562712\n",
       "1            housewife  0.510505\n",
       "2     graphic_designer  0.505180\n",
       "3        schoolteacher  0.497949\n",
       "4        businesswoman  0.493489\n",
       "5            paralegal  0.492551\n",
       "6     registered_nurse  0.490797\n",
       "7           saleswoman  0.488163\n",
       "8  electrical_engineer  0.479773\n",
       "9  mechanical_engineer  0.475540"
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "analogy('man', 'computer_programmer', 'woman')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Useful software package: spaCy (10 min)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In hw6 you will use `spacy`:\n",
    "\n",
    "- Actively developed: https://github.com/explosion/spaCy\n",
    "- Interactive lessons by Ines Montani: https://course.spacy.io/en/\n",
    "  - We're reusing her platform! https://prog-learn.mds.ubc.ca/\n",
    "- Abstracts away many of the processing steps if you want it to, but is also customizable"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {},
   "outputs": [],
   "source": [
    "import spacy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [],
   "source": [
    "nlp = spacy.load(\"en_core_web_md\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {},
   "outputs": [],
   "source": [
    "nlp_obama = nlp('Obama speaks to the media in Illinois')\n",
    "nlp_president = nlp('The president greets the press in Chicago')\n",
    "nlp_unrelated = nlp('Data science is a multidisciplinary blend of data inference, algorithmm development, and technology.')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.8589440859597451"
      ]
     },
     "execution_count": 75,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "nlp_obama.similarity(nlp_president)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 76,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.6553394660104532"
      ]
     },
     "execution_count": 76,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "nlp_obama.similarity(nlp_unrelated)"
   ]
  },
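  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- We mentioned many other NLP tasks above.\n",
    "- One of them is named entity recognition (NER): finding spans of text that refer to people, places, organizations, dates, and so on.\n",
    "- spaCy makes this easy as well: the entities of a processed document are available through its `ents` attribute."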
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 77,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(Illinois,)"
      ]
     },
     "execution_count": 77,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "nlp_obama.ents"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 78,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(Chicago,)"
      ]
     },
     "execution_count": 78,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "nlp_president.ents"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 79,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "()"
      ]
     },
     "execution_count": 79,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "nlp_unrelated.ents"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 80,
   "metadata": {},
   "outputs": [],
   "source": [
    "from spacy import displacy"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 81,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\">Obama speaks to the media in \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Illinois\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       "</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "displacy.render(nlp_obama, style=\"ent\", jupyter=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Looking at one of the movie reviews:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 82,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Steamboat Willy was not the first cartoon to feature Mickey Mouse. The first film to star America\\'s friend was \"Plane Crazy\". \"Plane Crazy\" was released May 15th 1928 in Hollywood California,in the silent movie format. \"Steamboat Willy\" was released November 18th 1928 as a SOUND movie (it was also released July 29th 1928 as a silent film). Thus making \"Steamboat..\"the first SOUND film of Mickey but NOT the first film for the little American Mouse. While many game shows have used the question: \"What was the first appearance of Mickey Mouse?\" The true answer is \"Plane Crazy\" not \"Steamboat Willy\". These dates can be checkout on IMDb under \"release dates\".'"
      ]
     },
     "execution_count": 82,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_train.iloc[10][\"review\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 83,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\">Steamboat Willy was not the \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    first\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORDINAL</span>\n",
       "</mark>\n",
       " cartoon to feature \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Mickey Mouse\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       ". The \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    first\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORDINAL</span>\n",
       "</mark>\n",
       " film to star \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    America\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       "'s friend was &quot;Plane Crazy&quot;. &quot;Plane Crazy&quot; was released \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    May 15th 1928\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " in \n",
       "<mark class=\"entity\" style=\"background: #feca74; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Hollywood California\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">GPE</span>\n",
       "</mark>\n",
       ",in the silent movie format. &quot;Steamboat Willy&quot; was released \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    November 18th 1928\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " as a SOUND movie (it was also released \n",
       "<mark class=\"entity\" style=\"background: #bfe1d9; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    July 29th 1928\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">DATE</span>\n",
       "</mark>\n",
       " as a silent film). Thus making &quot;Steamboat..&quot;the \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    first\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORDINAL</span>\n",
       "</mark>\n",
       " SOUND film of \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Mickey\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       " but NOT the \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    first\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORDINAL</span>\n",
       "</mark>\n",
       " film for the little \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    American Mouse\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       ". While many game shows have used the question: &quot;What was the \n",
       "<mark class=\"entity\" style=\"background: #e4e7d2; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    first\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORDINAL</span>\n",
       "</mark>\n",
       " appearance of \n",
       "<mark class=\"entity\" style=\"background: #aa9cfc; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Mickey Mouse\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">PERSON</span>\n",
       "</mark>\n",
       "?&quot; The true answer is &quot;\n",
       "<mark class=\"entity\" style=\"background: #f0d0ff; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Plane Crazy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">WORK_OF_ART</span>\n",
       "</mark>\n",
       "&quot; not &quot;\n",
       "<mark class=\"entity\" style=\"background: #f0d0ff; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Steamboat Willy\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">WORK_OF_ART</span>\n",
       "</mark>\n",
       "&quot;. These dates can be checkout on IMDb under &quot;release dates&quot;.</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "displacy.render(nlp(df_train.iloc[10][\"review\"]), jupyter=True, style='ent')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 90,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "first 28 33 ORDINAL\n",
      "Mickey Mouse 53 65 PERSON\n",
      "first 71 76 ORDINAL\n",
      "America 90 97 GPE\n",
      "May 15th 1928 153 166 DATE\n",
      "Hollywood California 170 190 GPE\n",
      "November 18th 1928 250 268 DATE\n",
      "July 29th 1928 308 322 DATE\n",
      "first 371 376 ORDINAL\n",
      "Mickey 391 397 PERSON\n",
      "first 410 415 ORDINAL\n",
      "American Mouse 436 450 ORG\n",
      "first 512 517 ORDINAL\n",
      "Mickey Mouse 532 544 PERSON\n",
      "Plane Crazy 567 578 WORK_OF_ART\n",
      "Steamboat Willy 585 600 WORK_OF_ART\n"
     ]
    }
   ],
   "source": [
    "for ent in nlp(df_train.iloc[10][\"review\"]).ents:\n",
    "    print(ent.text, ent.start_char, ent.end_char, ent.label_)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 91,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<span class=\"tex2jax_ignore\"><div class=\"entities\" style=\"line-height: 2.5; direction: ltr\">\n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " is a delicious fruit. I got my computer from \n",
       "<mark class=\"entity\" style=\"background: #7aecec; padding: 0.45em 0.6em; margin: 0 0.25em; line-height: 1; border-radius: 0.35em;\">\n",
       "    Apple\n",
       "    <span style=\"font-size: 0.8em; font-weight: bold; line-height: 1; border-radius: 0.35em; text-transform: uppercase; vertical-align: middle; margin-left: 0.5rem\">ORG</span>\n",
       "</mark>\n",
       " while I was eating an apple</div></span>"
      ],
      "text/plain": [
       "<IPython.core.display.HTML object>"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "displacy.render(nlp(\"Apple is a delicious fruit. I got my computer from Apple while I was eating an apple\"), style=\"ent\", jupyter=True)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "- All of this relies on pre-trained models, which took a lot of data, computation, and effort to build.\n",
    "- In certain cases you may choose to train your own models, but often there is no need."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## General thoughts on feature engineering (5 min)\n",
    "\n",
    "- Feature engineering is the general task of coming up with good features given the available input data\n",
    "- In the past this was often done \"manually\" for things like images, text, etc.\n",
    "- Now a lot of this has moved to deep learning, often (but not always!) using pre-trained models\n",
    "- You can still engineer whatever features you want, e.g. the number of bathrooms per bedroom\n",
    "- For a very complex model, engineered features help with overfitting (they inject prior knowledge)\n",
    "- For a very simple model, they help with underfitting (again by injecting prior knowledge)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "celltoolbar": "Slideshow",
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
